A theory of similarity functions for learning and clustering
published: Sept. 7, 2007, recorded: September 2007, views: 8993
Report a problem or upload filesIf you have found a problem with this lecture or would like to send us extra material, articles, exercises, etc., please use our ticket system to describe your request and upload the data.
Enter your e-mail into the 'Cc' field, and we will keep you updated with your request's status.
Kernel methods have proven to be very powerful tools in machine learning. In addition, there is a well-developed theory of sufficient conditions for a kernel to be useful for a given learning problem. However, while a kernel function can be thought of as just a pairwise similarity function that satisfies additional mathematical properties, this theory requires viewing kernels as implicit (and often difficult to characterize) maps into high-dimensional spaces. In this talk I will describe a more general theory that applies to more general similarity functions (not just legal kernels) and furthermore describes the usefulness of a given similarity function in terms of more intuitive, direct properties of the induced weighted graph. An interesting feature of the proposed framework is that it can also be applied to learning from purely unlabeled data, i.e., clustering. In particular, one can ask how much stronger the properties of a similarity function should be (in terms of its relation to the unknown desired clustering) so that it can be used to *cluster* well. Investigating this question leads to a number of interesting graph-theoretic properties, and their analysis in the inductive setting uses regularity-lemma type results of [FK99,AFKK03]. This work is joint with Maria-Florina Balcan and Santosh Vempala.
Link this pageWould you like to put a link to this lecture on your homepage?
Go ahead! Copy the HTML snippet !