
A Scalable Framework for Discovering Coherent Co-clusters in Noisy Data

author: Meghana Deodhar, The University of Texas at Austin

Description

Clustering problems often involve datasets where only a part of the data is relevant to the problem, e.g., in microarray data analysis only a subset of the genes show cohesive expressions within a subset of the conditions/features. The existence of a large number of non-informative data points and features makes it challenging to hunt for coherent and meaningful clusters in such datasets. Additionally, since clusters could exist in different subspaces of the feature space, a co-clustering algorithm that simultaneously clusters objects and features is often more suitable than one that is restricted to traditional “one-sided” clustering. We propose Robust Overlapping Co-Clustering (ROCC), a scalable and very versatile framework that addresses the problem of efficiently mining dense, arbitrarily positioned, possibly overlapping co-clusters from large, noisy datasets. ROCC has several desirable properties that make it extremely well suited to a number of real-life applications.

Slides
0:00 Robust Overlapping Co-Clustering Experimental Results A Scalable Framework for Discovering Coherent Co-clusters in Noisy Data
0:14 Table of contents
0:46 Small Clusters in Large Datasets
2:02 Clustering Challenges
2:50 Related Work (1)
3:05 Related Work (2)
3:18 Related Work (3)
3:36 Robust Overlapping Co-Clustering
3:53 ROCC: Key Idea
5:06 Distinguishing Features
6:13 Problem Definition (step 1)
7:10 Problem Definition (Objective function)
8:02 Problem Definition (step 2)
8:42 Approximation Schemes
9:42 ROCC Meta-Algorithm
11:14 Addressing the Local Minimum Problem
12:18 Generative Model Intuition
12:58 Results on Synthetic Datasets
13:38 Matrix Reconstruction
14:06 Microarray Datasets
14:50 Results on Microarray Data
16:38 Simultaneous Feature Selection and Clustering
18:03 Concluding Remarks
