A Spectral Clustering Approach to Optimally Combining Numerical Vectors with a Modular Network
Description
We address the problem of clustering numerical vectors together with a network. The setting is essentially equivalent to constrained clustering by Wagstaff and Cardie [20] and semi-supervised clustering by Basu et al. [2], but our focus is on optimally combining the two heterogeneous data sources. One application is web pages, which can be represented as numerical vectors by their contents (e.g., term frequencies) and which are hyperlinked to each other, forming a network. Another typical application is genes, whose behavior can be measured numerically and for which a gene network can be obtained from another data source. We first define a new graph clustering measure, which we call normalized network modularity, by balancing the cluster sizes in the original modularity. We then propose a new clustering method that integrates the cost of clustering numerical vectors and the cost of maximizing the normalized network modularity into a single spectral relaxation problem. Our learning algorithm is based on spectral clustering, which turns the problem into an eigenvalue problem, and uses k-means for the final cluster assignments. A significant advantage of our method is that the weight parameter balancing the two costs can be optimized from the given data by choosing the value that gives the minimum total cost. We evaluated the proposed method on a variety of datasets, including synthetic data as well as real-world data from molecular biology. Experimental results showed that our method is effective for clustering with both numerical vectors and a network.
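The pipeline outlined in the description (combine a vector cost with a modularity cost, solve an eigenvalue problem, finish with k-means) can be illustrated with a minimal Python sketch. It is not the authors' implementation and rests on several assumptions that do not come from the lecture: the vector cost is represented by a centered Gram matrix, the network cost by the standard (Newman) modularity matrix, both matrices are scaled by their Frobenius norm, only the k-1 leading eigenvectors are kept (both matrices map the constant vector to zero), and `weight` is fixed by hand. The data-driven choice of the weight by minimum total cost is not shown, because the description does not give the cost formula. All function names are hypothetical.

```python
import numpy as np

def modularity_matrix(A):
    """Standard modularity matrix B = A - d d^T / (2m) of an undirected graph."""
    d = A.sum(axis=1)                      # node degrees
    return A - np.outer(d, d) / d.sum()    # d.sum() equals 2m

def centered_gram(X):
    """Gram matrix of the mean-centered row vectors of X."""
    Xc = X - X.mean(axis=0)
    return Xc @ Xc.T

def kmeans(Z, k, n_iter=100):
    """Plain k-means with deterministic farthest-point initialisation."""
    centers = [Z[0]]
    for _ in range(1, k):
        dist = np.min([((Z - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(Z[np.argmax(dist)])
    centers = np.array(centers)
    for _ in range(n_iter):
        d2 = ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        new_centers = np.array([
            Z[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels

def combine_and_cluster(X, A, k, weight):
    """Cluster n items from vectors X (n x p) and adjacency matrix A (n x n).

    weight in [0, 1]: 1 uses only the vectors, 0 uses only the network.
    """
    if k < 2:
        raise ValueError("k must be at least 2")
    K = centered_gram(X)
    B = modularity_matrix(A)
    # Assumption of this sketch: Frobenius scaling makes the two terms comparable.
    M = weight * K / np.linalg.norm(K) + (1.0 - weight) * B / np.linalg.norm(B)
    _, vecs = np.linalg.eigh(M)            # eigenvalues in ascending order
    Z = vecs[:, -(k - 1):]                 # k-1 leading eigenvectors
    return kmeans(Z, k)

# Toy data: two well-separated groups of vectors ...
X = np.array([[0.0, 0.1], [0.2, 0.0], [0.1, 0.2], [0.0, 0.0],
              [5.0, 5.1], [5.2, 5.0], [5.1, 5.2], [5.0, 5.0]])
# ... and a network made of two 4-cliques joined by a single edge.
A = np.zeros((8, 8))
for i in range(8):
    for j in range(8):
        if i != j and (i < 4) == (j < 4):
            A[i, j] = 1.0
A[3, 4] = A[4, 3] = 1.0

labels = combine_and_cluster(X, A, k=2, weight=0.5)
print(labels)
```

In this toy example the vectors and the network agree on the grouping, so any weight should recover the same two clusters. The weight matters when the two sources disagree, which is the case the lecture's weight optimization is meant to handle.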
Slides
| Time | Slide |
| 0:03 | A Spectral Clustering Approach to Optimally Combining Numerical Vectors with a Modular Network |
| 0:18 | Table of Contents |
| 0:47 | Heterogeneous Data Clustering |
| 2:10 | Related work |
| 3:29 | Table of Contents |
| 3:37 | Spectral Clustering |
| 5:31 | Cost combining numerical vectors with a network |
| 6:19 | Complex Networks |
| 7:02 | Network Modularity |
| 8:19 | Cost Combining Numerical Vectors with a Network |
| 8:54 | Our Proposed Spectral Clustering |
| 9:50 | Table of Contents |
| 9:57 | Synthetic Data |
| 11:45 | Results for Synthetic Data (1) |
| 13:10 | Results for Synthetic Data (2) |
| 13:16 | Synthetic Data (Numerical Vector) + Real Data (Gene Network) |
| 14:50 | Summary |
| 15:35 | Thank you for your attention! |