A Hierarchical Information Theoretic Technique for the Discovery of Non Linear Alternative Clusterings
published: Oct. 1, 2010, recorded: July 2010, views: 2822
Discovering alternative clusterings is an important way to explore complex datasets. It lets the user view clustering behaviour from different perspectives and thus explore new hypotheses. However, current algorithms for alternative clustering have focused mainly on linear scenarios and may not perform as desired on datasets containing clusters with non-linear shapes. Our goal in this paper is to address this challenge of non-linearity. In particular, we propose a novel algorithm to uncover an alternative clustering that is distinctly different from an existing reference clustering. Our technique is information-theoretic: it aims to ensure the quality of the alternative clustering by maximizing the mutual information between clustering labels and data observations, while ensuring its distinctiveness by minimizing the information shared between the two clusterings. We perform experiments to assess our method against a large range of alternative clustering algorithms from the literature. We show that our technique generally performs better in non-linear scenarios and, furthermore, is highly competitive even in simpler, linear scenarios.
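To make the distinctiveness criterion concrete, the sketch below computes the mutual information between two label assignments over the same data points. It is only an illustration of the quantity the abstract says is minimized, not the paper's algorithm: the abstract does not specify the estimator used, so the plug-in (empirical frequency) estimate here is an assumption, and the function name `label_mutual_information` and the toy label lists are hypothetical.

```python
import math
from collections import Counter


def label_mutual_information(labels_a, labels_b):
    """Plug-in estimate of mutual information (in nats) between two
    cluster-label assignments of the same data points.

    Assumption: empirical frequencies stand in for the true label
    distributions; the paper may use a different estimator.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("label sequences must have equal length")
    n = len(labels_a)
    if n == 0:
        raise ValueError("label sequences must be non-empty")

    count_a = Counter(labels_a)                  # marginal counts, clustering A
    count_b = Counter(labels_b)                  # marginal counts, clustering B
    count_ab = Counter(zip(labels_a, labels_b))  # joint counts

    mi = 0.0
    for (a, b), n_ab in count_ab.items():
        # p(a, b) * log( p(a, b) / (p(a) * p(b)) ), written with counts
        mi += (n_ab / n) * math.log(n_ab * n / (count_a[a] * count_b[b]))
    return mi


# Toy example (hypothetical data): four points, two clusters each.
reference = [0, 0, 1, 1]    # existing reference clustering
same_split = [1, 1, 0, 0]   # same partition, labels renamed
alternative = [0, 1, 0, 1]  # partition that cuts across the reference

shared_with_same = label_mutual_information(reference, same_split)
shared_with_alt = label_mutual_information(reference, alternative)
```

In this toy case the renamed partition shares the maximum possible information with the reference (log 2 nats), while the cross-cutting partition shares none, which is the sense in which a low value indicates a distinct alternative clustering. The quality term, mutual information between labels and the data observations themselves, needs a density estimate over the data and is not covered by this sketch.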