On a Theory of Similarity Functions for Learning and Clustering

author: Avrim Blum, School of Computer Science, Carnegie Mellon University
published: July 30, 2009, recorded: June 2009, views: 557

Description

Kernel methods have become powerful tools in machine learning. They perform well in many applications, and there is a well-developed theory of what makes a given kernel useful for a given learning problem. However, this theory requires viewing kernels as implicit (and often difficult to characterize) maps into high-dimensional spaces. In this talk I will describe work on developing a theory that views a kernel simply as a measure of similarity between data objects and describes the usefulness of a given kernel (or more general similarity function) in terms of fairly intuitive, direct properties of how the similarity function relates to the task at hand, without needing to refer to any implicit spaces.

I will also talk about an extension of this framework to learning from purely unlabeled data, i.e., clustering. In particular, one can ask how much stronger the properties of a similarity function must be (in terms of its relation to the unknown desired clustering) so that it can be used to cluster well, that is, to learn well without any label information at all. We find that if we are willing to relax the objective a bit (for example, by allowing the algorithm to produce a hierarchical clustering that we call successful if some pruning of it is close to the desired clustering), then this question leads to a number of interesting graph-theoretic and game-theoretic properties that are sufficient to cluster well. This work can be viewed as defining a kind of PAC model for clustering. (This talk is based on joint work with Maria-Florina Balcan, Santosh Vempala, and Nati Srebro.)
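As an illustration of what a "direct property" of a similarity function might look like, here is a minimal sketch that is not taken from the talk. It assumes one simple property: most points are, on average, more similar to points with their own label than to points with the other label. Under that assumption, a new point can be labeled by comparing its average similarity to a sample of positives against its average similarity to a sample of negatives, with no reference to an implicit feature space. The function names, the similarity function, and the toy data below are all hypothetical.

```python
from statistics import mean


def similarity(x, y):
    """Hypothetical similarity in (0, 1]: larger when two numbers are close."""
    return 1.0 / (1.0 + abs(x - y))


def classify(x, positives, negatives, sim=similarity):
    """Label x by comparing its average similarity to each labeled sample.

    Returns +1 if x is on average at least as similar to the positive
    sample as to the negative sample, and -1 otherwise.
    """
    avg_pos = mean(sim(x, p) for p in positives)
    avg_neg = mean(sim(x, n) for n in negatives)
    return 1 if avg_pos >= avg_neg else -1


# Toy labeled samples (assumed data, for illustration only).
positives = [0.9, 1.0, 1.2]
negatives = [4.8, 5.0, 5.3]

# One query point near each group.
labels = [classify(x, positives, negatives) for x in (1.1, 4.9)]
print(labels)
```

The rule uses only pairwise similarity values, so any similarity function could be passed in through the `sim` parameter, whether or not it is a valid kernel. How well the rule works depends entirely on how strongly the assumed property holds for the task.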

See Also:

Download slides: mlss09us_blum_tsflc_01.ppt (1.5 MB)

