Correlation Clustering: From Theory to Practice
Correlation clustering is arguably the most natural formulation of clustering. Given a set of objects and a pairwise similarity measure between them, the goal is to cluster the objects so that, to the best possible extent, similar objects are put in the same cluster and dissimilar objects are put in different clusters. Because it needs only a definition of similarity, it applies to a wide range of problems in different contexts, and it is particularly well suited to clustering structured objects for which feature vectors can be difficult to obtain. Despite its simplicity, generality and wide applicability, correlation clustering has so far received much more attention from the algorithmic theory community than from the data mining community. The goal of this tutorial is to show how correlation clustering can be a powerful addition to the toolkit of the data mining researcher and practitioner, and to encourage discussion and further research in the area. In the tutorial we will survey the problem and its most common variants, with an emphasis on the algorithmic techniques and key ideas developed to derive efficient solutions. We will motivate the problems and discuss real-world applications, the scalability issues that may arise, and the existing approaches to handling them.
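To make the problem statement concrete, here is a minimal sketch in Python. It assumes the simplest setting, in which every pair of objects is labelled either similar or dissimilar, and it counts "disagreements": similar pairs that are split across clusters plus dissimilar pairs that share a cluster. It also includes a randomized pivot heuristic in the style of Ailon, Charikar and Newman as one example of an algorithmic technique; this choice is an assumption for illustration and is not taken from the tutorial itself. The names `disagreements`, `pivot_clustering` and the toy data are hypothetical.

```python
import random
from itertools import combinations


def disagreements(clusters, similar):
    """Count pairs that the clustering treats contrary to the similarity relation.

    clusters: list of lists of objects (a partition of the object set).
    similar:  set of frozenset({u, v}) pairs labelled similar; every other
              pair is assumed to be dissimilar (assumption of this sketch).
    """
    label = {x: i for i, cluster in enumerate(clusters) for x in cluster}
    cost = 0
    for u, v in combinations(list(label), 2):
        same_cluster = label[u] == label[v]
        is_similar = frozenset((u, v)) in similar
        if same_cluster != is_similar:
            cost += 1
    return cost


def pivot_clustering(objects, similar, rng):
    """Randomized pivot heuristic: pick a random unclustered object, group it
    with all unclustered objects similar to it, and repeat on the rest."""
    remaining = list(objects)
    clusters = []
    while remaining:
        pivot = rng.choice(remaining)
        cluster = [pivot] + [
            x for x in remaining
            if x != pivot and frozenset((pivot, x)) in similar
        ]
        clusters.append(cluster)
        members = set(cluster)
        remaining = [x for x in remaining if x not in members]
    return clusters


# Toy data (hypothetical): two groups of mutually similar objects.
objects = ["a", "b", "c", "d", "e"]
similar = {
    frozenset(p)
    for p in [("a", "b"), ("b", "c"), ("a", "c"), ("d", "e")]
}

clusters = pivot_clustering(objects, similar, random.Random(0))
cost = disagreements(clusters, similar)
```

Note that the number of clusters is not an input: it emerges from the similarity labels alone. On this toy input the similar pairs form two disjoint groups, so any choice of pivots recovers them with zero disagreements; on inconsistent inputs (for example, a similar to b, b similar to c, but a dissimilar to c) no clustering is perfect and the result depends on the pivot order.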
Download slides: kdd2014_bonchi_garcia_soriano_liberty_clustering.pdf (6.4 MB)