Covariate Shift by Kernel Mean Matching
published: Jan. 19, 2010, recorded: December 2009, views: 9082
Given sets of observations of training and test data, we consider the problem of reweighting the training data such that its distribution more closely matches that of the test data. We achieve this goal by matching covariate distributions between training and test sets in a high-dimensional feature space (specifically, a reproducing kernel Hilbert space). This approach does not require distribution estimation. Instead, the sample weights are obtained by a simple quadratic programming procedure. We first describe how distributions may be mapped to reproducing kernel Hilbert spaces. Next, we review distances between such mappings, and describe conditions under which the feature space mappings are injective (and thus, distributions have a unique mapping). Finally, we demonstrate how a transfer learning algorithm can be obtained by reweighting the training points such that their feature mean matches that of the (unlabeled) test distribution. Our correction procedure yields its greatest and most consistent advantages when the learning algorithm returns a classifier/regressor that is "simpler" than the data might suggest. On the other hand, even an ideal sample reweighting may not be of practical benefit given a sufficiently powerful classifier (if available).
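The reweighting idea in the abstract can be illustrated with a small, self-contained sketch. It is not the speaker's implementation: it assumes a Gaussian kernel on one-dimensional inputs, keeps only box constraints on the weights (no normalization constraint on their sum), and replaces a dedicated quadratic programming solver with plain projected gradient descent. All function and parameter names (`kmm_weights`, `squared_mmd`, `upper`, `sigma`) are hypothetical and chosen for illustration only.

```python
import math


def gaussian_kernel(x, y, sigma=1.0):
    """Gaussian (RBF) kernel on scalars; an assumed choice of kernel."""
    return math.exp(-((x - y) ** 2) / (2.0 * sigma ** 2))


def squared_mmd(beta, train, test_pts, sigma=1.0):
    """Squared RKHS distance between the weighted training feature mean
    and the test feature mean."""
    n, m = len(train), len(test_pts)
    tr_tr = sum(beta[i] * beta[j] * gaussian_kernel(train[i], train[j], sigma)
                for i in range(n) for j in range(n)) / n ** 2
    tr_te = sum(beta[i] * gaussian_kernel(train[i], t, sigma)
                for i in range(n) for t in test_pts) / (n * m)
    te_te = sum(gaussian_kernel(s, t, sigma)
                for s in test_pts for t in test_pts) / m ** 2
    return tr_tr - 2.0 * tr_te + te_te


def kmm_weights(train, test_pts, sigma=1.0, upper=10.0, n_iter=2000):
    """Approximate mean-matching weights by projected gradient descent.

    Minimizes squared_mmd over beta subject to 0 <= beta_i <= upper.
    This is a simplification: a QP solver would normally be used.
    """
    n, m = len(train), len(test_pts)
    K = [[gaussian_kernel(a, b, sigma) for b in train] for a in train]
    # kappa[i]: average kernel similarity of training point i to the test set.
    kappa = [sum(gaussian_kernel(a, t, sigma) for t in test_pts) / m
             for a in train]
    beta = [1.0] * n  # start from uniform weights
    # The Hessian is (2/n^2) K; for a kernel bounded by 1 its largest
    # eigenvalue is at most 2/n, so a step of n/2 does not overshoot.
    step = n / 2.0
    for _ in range(n_iter):
        grad = [(2.0 / n ** 2) * sum(K[i][j] * beta[j] for j in range(n))
                - (2.0 / n) * kappa[i] for i in range(n)]
        # Gradient step followed by projection onto the box [0, upper].
        beta = [min(upper, max(0.0, b - step * g))
                for b, g in zip(beta, grad)]
    return beta


# Toy data: training inputs spread out, test inputs concentrated on the right.
train_pts = [-2.0, -1.0, 0.0, 1.0, 2.0]
test_pts = [1.0, 1.5, 2.0]
weights = kmm_weights(train_pts, test_pts)
print([round(w, 3) for w in weights])
```

In this toy setting the training points that lie near the test inputs receive larger weights than those far away, and the weighted training mean ends up at least as close to the test mean in feature space as the unweighted one. The resulting weights could then be passed as per-sample weights to any learner that accepts them.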