Covariate Shift by Kernel Mean Matching

author: Arthur Gretton, Max Planck Institute for Biological Cybernetics
published: Jan. 19, 2010, recorded: December 2009, views: 414

Slides
0:00 Kernel approaches to covariate shift
0:38 Transfer learning and covariate shift -1
1:49 Transfer learning and covariate shift -2
2:06 Transfer learning and covariate shift -3
2:28 A toy example -1
3:06 A toy example -2
3:23 A toy example -3
3:40 The solution procedure -1
4:26 The solution procedure -2
5:04 The solution procedure -3
5:44 Importance weighting -1
6:10 Importance weighting -2
6:24 Importance weighting -3
7:01 Importance weighting -4
7:43 Importance weighting -5
8:19 Importance weighting -6
9:11 Importance weighting -7
9:37 Alternatives to density estimation -1
11:00 Alternatives to density estimation -2
13:10 Maximum mean discrepancy
13:21 Function Showing Difference in Distributions -1
14:06 Function Showing Difference in Distributions -2
14:53 Function Showing Difference in Distributions -3
15:50 Function Showing Difference in Distributions -4
16:20 Function Showing Difference in Distributions -5
16:32 Function Showing Difference in Distributions -6
16:45 Function Showing Difference in Distributions -7
17:29 Function Showing Difference in Distributions -8
17:34 Function Showing Difference in Distributions -9
17:55 Function Showing Difference in Distributions -10
18:22 Function Showing Difference in Distributions -11
18:52 Transfer learning using maximum mean discrepancy
19:05 Transfer learning by KMM -1
19:10 Transfer learning by KMM -2
20:12 Transfer learning by KMM -3
21:15 Transfer learning by KMM -4
21:48 Transfer learning by KMM -5
22:14 Transfer learning by KMM -6
22:49 Transfer learning by KMM -7
23:17 Transfer learning by KMM -8
24:39 Transfer learning by KMM -9
24:52 Transfer learning by KMM -10
25:02 Transfer learning by KMM -11
25:34 Reweighting by classification -1
26:52 Reweighting by classification -2
29:04 Experiments
29:10 Breast Cancer data -1
30:16 Breast Cancer data -2
31:19 Breast Cancer data -3
32:11 Toy example revisited
33:17 Breast Cancer data -3
33:26 Large scale experiments -1
34:56 Large scale experiments -2
35:06 Large scale experiments -3
35:26 Large scale experiments -4
35:30 Large scale experiments -5
35:57 Large scale experiments -3
36:37 Further work: model selection -1
38:00 Further work: model selection -2
41:13 Summary
42:34 Acknowledgements

Description

Given sets of observations of training and test data, we consider the problem of reweighting the training data such that its distribution more closely matches that of the test data. We achieve this goal by matching covariate distributions between training and test sets in a high-dimensional feature space (specifically, a reproducing kernel Hilbert space). This approach does not require distribution estimation. Instead, the sample weights are obtained by a simple quadratic programming procedure. We first describe how distributions may be mapped to reproducing kernel Hilbert spaces. Next, we review distances between such mappings, and describe conditions under which the feature space mappings are injective (and thus, distributions have a unique mapping). Finally, we demonstrate how a transfer learning algorithm can be obtained by reweighting the training points such that their feature mean matches that of the (unlabeled) test distribution. Our correction procedure yields its greatest and most consistent advantages when the learning algorithm returns a classifier/regressor that is "simpler" than the data might suggest. On the other hand, even an ideal sample reweighting may not be of practical benefit given a sufficiently powerful classifier (if available).
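
The reweighting idea in the description can be illustrated with a small sketch. It is not the implementation from the lecture. It assumes a Gaussian kernel on one-dimensional inputs, and it replaces the quadratic programming solver with plain projected gradient descent on the same quadratic objective, keeping only the box constraint 0 <= beta_i <= bound and omitting the usual constraint that the weights sum to roughly n. The names kmm_weights and kmm_objective, the parameters sigma, bound and iters, and the toy data are all hypothetical choices made for illustration.

```python
import math

def gaussian_kernel(x, y, sigma):
    """Gaussian RBF kernel between two scalars (assumed kernel choice)."""
    return math.exp(-((x - y) ** 2) / (2.0 * sigma ** 2))

def kmm_objective(beta, K, kappa):
    """0.5 * beta^T K beta - kappa^T beta.

    Up to a positive scale and an additive constant, this is the squared
    RKHS distance between the weighted training mean and the test mean.
    """
    n = len(beta)
    quad = sum(beta[i] * K[i][j] * beta[j] for i in range(n) for j in range(n))
    return 0.5 * quad - sum(k * b for k, b in zip(kappa, beta))

def kmm_weights(train, test, sigma=0.5, bound=10.0, iters=500):
    """Approximate mean-matching weights by projected gradient descent.

    Simplification: only the box constraint is enforced; a QP solver
    would also handle the constraint on the sum of the weights.
    """
    n, m = len(train), len(test)
    K = [[gaussian_kernel(a, b, sigma) for b in train] for a in train]
    kappa = [(n / m) * sum(gaussian_kernel(a, t, sigma) for t in test)
             for a in train]
    beta = [1.0] * n          # start from uniform weights
    step = 1.0 / n            # safe: largest eigenvalue of K <= trace(K) = n
    for _ in range(iters):
        grad = [sum(K[i][j] * beta[j] for j in range(n)) - kappa[i]
                for i in range(n)]
        # Gradient step followed by projection onto the box [0, bound]
        beta = [min(bound, max(0.0, b - step * g)) for b, g in zip(beta, grad)]
    return beta, K, kappa

# Toy data (hypothetical): training inputs cover [-2, 2], test inputs [1, 2].
train = [-2.0 + 0.2 * i for i in range(21)]
test = [1.0 + 0.05 * j for j in range(21)]
beta, K, kappa = kmm_weights(train, test)
```

On this toy data the weights of training points far from the test region shrink towards zero, while those inside the test region grow. The resulting weights could then be passed as per-sample weights to a learning algorithm that accepts them.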
