A Scalable Framework for Discovering Coherent Co-clusters in Noisy Data
published: Aug. 26, 2009, recorded: June 2009, views: 379
Clustering problems often involve datasets where only a part of the data is relevant to the problem, e.g., in microarray data analysis only a subset of the genes show cohesive expressions within a subset of the conditions/features. The existence of a large number of non-informative data points and features makes it challenging to hunt for coherent and meaningful clusters from such datasets. Additionally, since clusters could exist in different subspaces of the feature space, a co-clustering algorithm that simultaneously clusters objects and features is often more suitable as compared to one that is restricted to traditional “one-sided” clustering. We propose Robust Overlapping Co-Clustering (ROCC), a scalable and very versatile framework that addresses the problem of efficiently mining dense, arbitrarily positioned, possibly overlapping co-clusters from large, noisy datasets. ROCC has several desirable properties that make it extremely well suited to a number of real life applications.
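To make the notion of a coherent co-cluster concrete, the following is a minimal illustrative sketch and not the ROCC algorithm itself. It assumes a co-cluster is a subset of rows together with a subset of columns, and it scores coherence by the mean squared deviation of the selected entries from their own mean. The function names, the planted block, and the noise ranges are hypothetical choices made only for this illustration.

```python
import random


def cocluster_mse(matrix, rows, cols):
    """Mean squared deviation of the selected submatrix from its own mean.

    A low value means the chosen rows behave alike on the chosen columns.
    This scoring rule is an assumption made for illustration.
    """
    values = [matrix[r][c] for r in rows for c in cols]
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def make_noisy_matrix(n_rows=20, n_cols=10, seed=0):
    """Build a mostly non-informative matrix with one planted coherent block."""
    rng = random.Random(seed)
    # Background: uninformative noise spread over a wide range.
    matrix = [[rng.uniform(-5, 5) for _ in range(n_cols)] for _ in range(n_rows)]
    # Planted co-cluster: a few rows that agree on a few columns only.
    planted_rows, planted_cols = [2, 5, 7, 11], [1, 4, 8]
    for r in planted_rows:
        for c in planted_cols:
            matrix[r][c] = 3.0 + rng.uniform(-0.1, 0.1)
    return matrix, planted_rows, planted_cols


matrix, planted_rows, planted_cols = make_noisy_matrix()

# Coherence of the planted rows restricted to the planted columns.
planted_error = cocluster_mse(matrix, planted_rows, planted_cols)

# Coherence of the same rows over all columns, as a one-sided view would see them.
full_row_error = cocluster_mse(matrix, planted_rows, range(len(matrix[0])))

print(f"co-cluster error: {planted_error:.4f}")
print(f"same rows, all columns: {full_row_error:.4f}")
```

In this toy setting the planted rows look coherent only when restricted to the planted columns; across all columns the noise dominates, which is the situation the abstract describes as motivating co-clustering over one-sided clustering.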