Characterization of Linkage Based Clustering
published: Jan. 19, 2010, recorded: December 2009, views: 3587
Report a problem or upload filesIf you have found a problem with this lecture or would like to send us extra material, articles, exercises, etc., please use our ticket system to describe your request and upload the data.
Enter your e-mail into the 'Cc' field, and we will keep you updated with your request's status.
There are a wide variety of clustering algorithms that, when run on the same data, often produce very different clusterings. Yet there is no principled method to guide the selection of a clustering algorithm. The choice of an appropriate clustering is, of course, task dependent. As such, we must rely on domain knowledge. The challenge is to communicate such knowledge between the domain expert and the algorithm designer. One approach to providing guidance to clustering users in the selection of a clustering algorithm is to identify important properties that a user may want an algorithm to satisfy, and determine which algorithms satisfy each of these properties. Clustering users can then utilize prior knowledge to determine the properties that make sense for their application.
Ultimately, there would be a sufficiently rich set of properties that would provide detailed enough guidelines for a wide variety of clustering users. For a property to be useful, a user needs to be able to easily determine the desirability of the property. Such a description of clustering algorithms would yield principled guidelines for clustering algorithm selection by answering a series of simple questions. Bosagh Zadeh and Ben-David  make progress in this direction by providing a set of abstract properties that characterize single linkage. In this work, we give another result in the same direction by characterizing a family of clustering algorithms. These are initial steps toward the ambitious program of developing broad guidelines for clustering algorithm selection.
Linkage-based clustering is one of the most commonly-used and widely-studied clustering paradigms. We provide a surprisingly simple set of properties that uniquely identify linkage-based clustering algorithms. Our characterization highlights how linkage-based algorithms compare to other clustering algorithms.
Combining previously proposed properties with our newly proposed ones, we show how these properties partition the space of commonly-used clustering algorithms. Specifically, we show which of these properties are satisfied by common linkage-based, centroid-based, and spectral clustering algorithms. We hope that this analysis, as well as our characterization of linkage-based clustering, will provide useful guidelines for users in selecting clustering algorithms.
Download slides: nipsworkshops09_loker_clb_01.pdf (103.1 KB)
Download slides: nipsworkshops09_loker_clb_01.ppt (1.2 MB)
Link this pageWould you like to put a link to this lecture on your homepage?
Go ahead! Copy the HTML snippet !