10 Year Best Paper: Combining Labeled and Unlabeled Data with Co-Training

author: Jude W. Shavlik, University of Wisconsin - Madison
published: July 24, 2008,   recorded: July 2008,   views: 10429


We consider the problem of using a large unlabeled sample to boost the performance of a learning algorithm when only a small set of labeled examples is available. In particular, we consider a problem setting motivated by the task of learning to classify web pages, in which the description of each example can be partitioned into two distinct views. For example, the description of a web page can be partitioned into the words occurring on that page and the words occurring in hyperlinks that point to that page. We assume that either view of the example would be sufficient for learning if we had enough labeled data, but our goal is to use both views together to allow inexpensive unlabeled data to augment a much smaller set of labeled examples. Specifically, the presence of two distinct views of each example suggests strategies in which two learning algorithms are trained separately on each view, and then each algorithm's predictions on new unlabeled examples are used to enlarge the training set of the other. Our goal in this paper is to provide a PAC-style analysis for this setting and, more broadly, a PAC-style framework for the general problem of learning from both labeled and unlabeled data. We also provide empirical results on real web page data indicating that this use of unlabeled examples can lead to significant improvement of hypotheses in practice.

Restricted Boltzmann Machines (RBMs) have been developed for a large variety of learning problems. However, RBMs are usually used as feature extractors for another learning algorithm or to provide a good initialization for deep feed-forward neural network classifiers, and are not considered as a stand-alone solution to classification problems. In this paper, we argue that RBMs provide a self-contained framework for deriving competitive non-linear classifiers. We present an evaluation of different learning algorithms for RBMs which aim at introducing a discriminative component to RBM training and improving their performance as classifiers. This approach is simple in that RBMs are used directly to build a classifier, rather than as a stepping stone. Finally, we demonstrate how discriminative RBMs can also be successfully employed in a semi-supervised setting.
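
The co-training strategy described in the abstract (one learner per view, with each learner's confident predictions on unlabeled examples enlarging the training data available to the other) can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and does not come from the lecture or the paper: the `CentroidClassifier` stand-in learner, the `co_train` function, its `rounds` and `per_round` parameters, the use of a single shared pool of labels, and the toy data. Any classifier that reports a confidence score could take the place of the stand-in.

```python
class CentroidClassifier:
    """Hypothetical stand-in learner: nearest class centroid, with a margin as confidence."""

    def fit(self, X, y):
        self.centroids = {}
        for label in set(y):
            rows = [x for x, t in zip(X, y) if t == label]
            self.centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
        return self

    def _dist(self, x, centroid):
        return sum((a - b) ** 2 for a, b in zip(x, centroid))

    def predict_with_confidence(self, x):
        # Rank classes by squared distance; confidence is the gap between the two closest.
        ranked = sorted(self.centroids, key=lambda lab: self._dist(x, self.centroids[lab]))
        best = ranked[0]
        if len(ranked) == 1:
            return best, 0.0
        margin = self._dist(x, self.centroids[ranked[1]]) - self._dist(x, self.centroids[best])
        return best, margin


def _fit(clf, view, labels):
    # Train on the currently labeled examples only.
    idx = [i for i, t in enumerate(labels) if t is not None]
    clf.fit([view[i] for i in idx], [labels[i] for i in idx])


def co_train(view1, view2, labels, rounds=10, per_round=2):
    """labels holds a class for labeled examples and None for unlabeled ones."""
    labels = list(labels)
    clf1, clf2 = CentroidClassifier(), CentroidClassifier()
    for _ in range(rounds):
        _fit(clf1, view1, labels)
        _fit(clf2, view2, labels)
        if all(t is not None for t in labels):
            break
        # Each learner labels the unlabeled examples it is most confident about.
        # The labels go into a shared pool, so they become training data for the
        # other learner in the next round (an assumed simplification).
        for clf, view in ((clf1, view1), (clf2, view2)):
            candidates = [i for i, t in enumerate(labels) if t is None]
            scored = [(clf.predict_with_confidence(view[i]), i) for i in candidates]
            scored.sort(key=lambda s: s[0][1], reverse=True)
            for (label, _margin), i in scored[:per_round]:
                labels[i] = label
    # Final refit so both learners see every label added in the last round.
    _fit(clf1, view1, labels)
    _fit(clf2, view2, labels)
    return clf1, clf2, labels


# Toy data (made up): two labeled examples, six unlabeled, one feature per view.
view1 = [[0.0], [10.0], [1.0], [9.0], [0.5], [9.5], [2.0], [8.0]]
view2 = [[0.0], [10.0], [0.5], [9.5], [1.0], [9.0], [1.5], [8.5]]
initial_labels = [0, 1, None, None, None, None, None, None]

clf1, clf2, final_labels = co_train(view1, view2, initial_labels)
print(final_labels)
```

In this sketch the two views are deliberately easy and agree with each other; on real data such as page text versus hyperlink text, the number of examples added per round and the confidence measure would need tuning, since a confident but wrong label is passed on to the other learner.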

See Also:

Download slides: icml08_shavlik_clud_01.ppt (3.4 MB)
