Clustering without any subjective similarity information
published: July 20, 2010, recorded: June 2010, views: 202
Report a problem or upload filesIf you have found a problem with this lecture or would like to send us extra material, articles, exercises, etc., please use our ticket system to describe your request and upload the data.
Enter your e-mail into the 'Cc' field, and we will keep you updated with your request's status.
Consider the task of clustering university web pages based on the graph of links between these pages. Can clusters of "functionally similar" pages be detected from just this link structure? Note that this is a clustering task in which one starts without any prior knowledge of any similarity or distance measure between the domain elements. All the information in the input comes as objective, observed, binary relations among the objects. These relations are not similarity links. For example, the cluster of professors pages have very internal links, whereas the cluster of service pages have lots of internal links. What we are looking for are clusters whose members share similar link patterns with respect to the other clusters. We propose a formal model for such clustering tasks. Our model is based on an objective function that measures the homogeneity of between-clusters links. I shall discuss the computational complexity of finding a clustering with minimal objective cost and describe some hardness results as well as efficient approximation algorithms. The talk is (partly) based on work with Sharon Wulff.
Download slides: icml2010_ben_david_cwsi_01.pdf (209.2 KB)
Link this pageWould you like to put a link to this lecture on your homepage?
Go ahead! Copy the HTML snippet !