The 25th International Conference on Machine Learning (ICML 2008)

Dirichlet Component Analysis: Feature Extraction for Compositional Data

author: Hua-Yan Wang, Peking University

Description

We consider feature extraction (dimensionality reduction) for compositional data, where the data vectors are constrained to be positive and constant-sum. In real-world problems, the data components (variables) usually have complicated "correlations" while their total number is huge. Such a scenario demands feature extraction: we need to de-correlate the components and reduce their dimensionality. Traditional techniques such as Principal Component Analysis (PCA) are not suitable for these problems because of the unique statistical properties of compositional data and the need to satisfy its constraints. This paper presents a novel approach to feature extraction for compositional data. Our method first identifies a family of dimensionality reduction projections that preserve all relevant constraints, and then finds the optimal projection that maximizes the estimated Dirichlet precision on the projected data. It reduces the compositional data to a given lower dimensionality while de-correlating the components in the lower-dimensional space as much as possible. We develop the theoretical foundation of our approach and validate its effectiveness on synthetic and real-world datasets.
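
The two steps named in the abstract can be illustrated with a minimal sketch, which is not the talk's implementation. It assumes that a non-negative matrix whose columns each sum to 1 is one example of a constraint-preserving projection, that the Dirichlet precision (the sum of the Dirichlet parameters) is estimated by a simple method of moments, and that a plain random search stands in for whatever optimizer the paper uses. The function names `random_column_stochastic`, `project`, and `precision_moments` are hypothetical.

```python
import numpy as np

def random_column_stochastic(k, n, rng):
    """Hypothetical helper: a k x n matrix with positive entries whose
    columns each sum to 1 (assumed here to be a valid projection)."""
    R = rng.random((k, n)) + 1e-12  # tiny offset keeps every entry positive
    return R / R.sum(axis=0, keepdims=True)

def project(X, R):
    """Map each row of X (a composition with n parts) to k parts."""
    return X @ R.T

def precision_moments(Y):
    """Method-of-moments estimate of the Dirichlet precision s.
    For a Dirichlet, Var[y_k] = m_k * (1 - m_k) / (s + 1), so each
    component gives s = m_k * (1 - m_k) / Var[y_k] - 1. Averaging the
    per-component estimates is an assumption made for this sketch."""
    m = Y.mean(axis=0)
    v = Y.var(axis=0)
    return float(np.mean(m * (1.0 - m) / v - 1.0))

rng = np.random.default_rng(0)
X = rng.dirichlet(np.ones(6), size=500)  # 500 synthetic compositions, 6 parts

# Random search (a stand-in optimizer): keep the candidate projection
# with the largest estimated precision on the projected data.
best_R, best_s = None, -np.inf
for _ in range(200):
    R = random_column_stochastic(3, 6, rng)
    s = precision_moments(project(X, R))
    if s > best_s:
        best_R, best_s = R, s

Y = project(X, best_R)
print(Y.shape, bool(np.allclose(Y.sum(axis=1), 1.0)), bool((Y > 0).all()))
```

Because every column of the projection sums to 1, each projected vector still sums to 1, and positive entries keep it positive. Note that this naive objective rewards projections that simply average all components together, so the sketch omits any further restrictions the method places on the projection family.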

Slides
0:00 Dirichlet Component Analysis: Feature Extraction for Compositional Data
0:33 Storyline
0:41 Storyline - Intro
0:42 Intro - 1
0:59 Intro - 2
1:41 Storyline - A Toy Example
1:46 A Toy Example - 1
2:54 A Toy Example - 2
3:17 A Toy Example - 3
3:38 A Toy Example - 4
4:09 A Toy Example - 5
4:59 Storyline - DCA
5:05 DCA - 1
6:51 DCA - 2
7:31 DCA - 3
8:23 DCA - 4
9:06 DCA - 5
10:09 DCA - 6
10:59 DCA - 7
12:08 Storyline - Experiment Results
12:13 Experiment Results (Synthetic Data) - 1
12:47 Experiment Results (Synthetic Data) - 2
13:56 Experiment Results (Real-World Data) - 1
15:11 Experiment Results (Real-World Data) - 2
15:46 Experiment Results (Real-World Data) - 3
16:27 Thanks!
16:31 Questions
