Multi-View Dimensionality Reduction via Canonical Correlation Analysis

author: Sham M. Kakade, Microsoft Research New England, Microsoft Research
published: Dec. 20, 2008,   recorded: December 2008,   views: 5750




We analyze the multi-view regression problem, in which we have two views (X1, X2) of the input data and a real-valued target variable Y of interest. In a semi-supervised learning setting, we consider two separate assumptions, one based on redundancy and the other based on (de)correlation, and show how, under either assumption alone, dimensionality reduction based on CCA can reduce the labeled sample complexity. The basic semi-supervised algorithm is as follows: with the unlabeled data, perform CCA; with the labeled data, project the inputs onto a certain CCA subspace (i.e., perform dimensionality reduction) and then do least squares regression in this lower-dimensional space. Under either assumption, the number of labeled samples required can be significantly reduced in comparison to the single-view setting; in particular, we show that this dimensionality reduction introduces only a little bias but can drastically reduce the variance. Under the redundancy assumption, the best predictor from each view is roughly as good as the best predictor using both views. Under the uncorrelated assumption, the views X1 and X2 are uncorrelated conditioned on Y. We show that under either of these assumptions, CCA is appropriate as a dimensionality reduction technique. We are also in the process of running large-scale experiments on word disambiguation (using Wikipedia, with the disambiguation pages helping to provide labels). This work presents extensions of ideas in Ando and Zhang [2007] and Kakade and Foster [2007].
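
The semi-supervised procedure described in the abstract (CCA on unlabeled data, projection of the labeled inputs, then least squares) can be sketched roughly as follows. This is a minimal illustration, not the speaker's implementation: the abstract does not say which CCA subspace is used, so the sketch assumes the top-k canonical directions of the first view. The helper names (`inv_sqrt`, `fit_cca`, `fit_least_squares`, `predict`), the ridge term `reg`, and the synthetic data, in which a shared latent signal drives both views and the target, are all hypothetical choices made for this example.

```python
import numpy as np

def inv_sqrt(C, reg=1e-6):
    """Inverse square root of a symmetric PSD matrix, with a small ridge term."""
    w, V = np.linalg.eigh(C + reg * np.eye(C.shape[0]))
    return V @ np.diag(1.0 / np.sqrt(w)) @ V.T

def fit_cca(X1, X2, k, reg=1e-6):
    """CCA via whitening + SVD. Returns view-1 mean, top-k view-1 directions, correlations."""
    mu1, mu2 = X1.mean(axis=0), X2.mean(axis=0)
    A, B = X1 - mu1, X2 - mu2
    n = A.shape[0]
    W1 = inv_sqrt(A.T @ A / n, reg)   # whitening transform for view 1
    W2 = inv_sqrt(B.T @ B / n, reg)   # whitening transform for view 2
    U, s, _ = np.linalg.svd(W1 @ (A.T @ B / n) @ W2)
    return mu1, W1 @ U[:, :k], s[:k]

def fit_least_squares(Z, y):
    """Ordinary least squares with an intercept column."""
    Zb = np.column_stack([Z, np.ones(len(Z))])
    coef, *_ = np.linalg.lstsq(Zb, y, rcond=None)
    return coef

def predict(Z, coef):
    return np.column_stack([Z, np.ones(len(Z))]) @ coef

# Synthetic two-view data (an assumption for illustration only).
rng = np.random.default_rng(0)
d, k = 20, 2
M1, M2 = rng.normal(size=(k, d)), rng.normal(size=(k, d))
w_true = np.array([1.0, -2.0])

def sample(n):
    h = rng.normal(size=(n, k))                      # shared latent signal
    X1 = h @ M1 + 0.5 * rng.normal(size=(n, d))      # view 1
    X2 = h @ M2 + 0.5 * rng.normal(size=(n, d))      # view 2
    y = h @ w_true + 0.1 * rng.normal(size=n)        # real-valued target
    return X1, X2, y

# Step 1: CCA on plentiful unlabeled data (targets are discarded).
X1_u, X2_u, _ = sample(5000)
mu1, P1, rho = fit_cca(X1_u, X2_u, k)

# Step 2: project the few labeled inputs onto the CCA subspace, then regress.
X1_l, _, y_l = sample(30)
Z_lab = (X1_l - mu1) @ P1
coef = fit_least_squares(Z_lab, y_l)

# Evaluate on fresh data.
X1_t, _, y_t = sample(2000)
mse_test = np.mean((predict((X1_t - mu1) @ P1, coef) - y_t) ** 2)
print("canonical correlations:", rho)
print("test MSE:", mse_test, "target variance:", y_t.var())
```

The regression here is fit on only k = 2 projected features rather than the original 20, which is the sense in which the projection trades a small bias for lower variance when labeled data are scarce.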

Slides: lms08_kakade_mvdr_01.pdf (136.1 KB)
