The 13th International Conference on Knowledge Discovery and Data Mining

Local Decomposition for Rare Class Analysis

author: Junjie Wu, Oregon State University

Description

Given its importance, the problem of predicting rare classes in large-scale multi-labeled data sets has attracted great attention in the literature. However, the rare-class problem remains a critical challenge, because no natural way of handling imbalanced class distributions has been developed. This paper fills this void by developing a method for Classification using lOcal clusterinG (COG). Specifically, for a data set with an imbalanced class distribution, we perform clustering within each large class and produce sub-classes with relatively balanced sizes. Then, we apply traditional supervised learning algorithms, such as Support Vector Machines (SVMs), for classification. Our experimental results on various real-world data sets show that our method produces significantly higher prediction accuracies on rare classes than state-of-the-art methods. Furthermore, we show that COG can also improve the performance of traditional supervised learning algorithms on data sets with balanced class distributions.
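The two steps described above (cluster within each large class, then train a classifier on the resulting sub-classes) can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the clustering algorithm, so plain k-means is assumed here, and a nearest-centroid rule stands in for the SVM to keep the example dependency-free. The function names and the `k` and `min_size` parameters are hypothetical.

```python
from collections import defaultdict

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(points):
    """Coordinate-wise mean of a non-empty list of points."""
    return [sum(col) / len(points) for col in zip(*points)]

def kmeans(points, k, iters=10):
    """Plain Lloyd's k-means; returns one cluster index per point.
    Assumption: naive deterministic init from the first k points."""
    centers = [list(p) for p in points[:k]]
    assign = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest center.
        assign = [min(range(k), key=lambda c: sq_dist(p, centers[c]))
                  for p in points]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:
                centers[c] = mean(members)
    return assign

def local_decompose(X, y, k, min_size):
    """Split every class with at least min_size instances into k
    sub-classes; smaller (rare) classes are left whole.
    Each sub-label is a (original_label, cluster_index) tuple."""
    by_class = defaultdict(list)
    for i, label in enumerate(y):
        by_class[label].append(i)
    sub = [None] * len(y)
    for label, idx in by_class.items():
        if len(idx) >= min_size:
            clusters = kmeans([X[i] for i in idx], k)
        else:
            clusters = [0] * len(idx)
        for i, c in zip(idx, clusters):
            sub[i] = (label, c)
    return sub

def fit_centroids(X, sub):
    """Stand-in classifier (NOT an SVM): one centroid per sub-class."""
    groups = defaultdict(list)
    for x, s in zip(X, sub):
        groups[s].append(x)
    return {s: mean(pts) for s, pts in groups.items()}

def predict(model, x):
    """Predict the nearest sub-class, then map back to its original class."""
    nearest = min(model, key=lambda s: sq_dist(x, model[s]))
    return nearest[0]

# Toy data: a large "normal" class made of two blobs, with a small
# "rare" class sitting between them.
X = [(0, 0), (1, 0), (0, 1), (1, 1),
     (10, 0), (11, 0), (10, 1), (11, 1),
     (5, 0), (6, 1)]
y = ["normal"] * 8 + ["rare"] * 2

sub = local_decompose(X, y, k=2, min_size=4)
model = fit_centroids(X, sub)
preds = [predict(model, p) for p in [(0.5, 0.5), (10.5, 0.5), (5.4, 0.4)]]
print(preds)
```

In this toy setup the large class is split into two sub-classes of four points each, which are closer in size to the two-point rare class. Predictions are made at the sub-class level and then mapped back to the original labels, so the caller never sees the intermediate sub-labels.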

Slides
0:03 Local Decomposition for Rare Class Analysis
0:26 Outline pt 1
0:47 Rare Class Analysis
1:36 Research Motivation – Problems
2:41 Problem Formulation
3:07 Our Contributions
3:41 Outline pt 2
3:49 Directions and Objectives of Our Method
5:09 Algorithm Description
6:16 An Example
7:24 Effect of COG&COG-OS on Rare Class pt 1
8:05 Effect of COG&COG-OS on Rare Class pt 2
8:56 Outline pt 3
9:02 Experimental Design
9:22 The Experimental Setup
9:58 1-1: COG on Imbalanced 2-Class Data pt 1
10:56 1-1: COG on Imbalanced 2-Class Data pt 2
11:32 1-2: COG on Imbalanced Multi-Class Data
12:01 1-3: COG vs. Resampling
12:51 1-4: COG on KDDCUP99 Data
13:51 2-1: COG on Balanced Data pt 1
14:41 2-1: COG on Balanced Data pt 2
15:05 2-2: COG vs. Random Partitioning
15:44 3: Discussion on Feature Selection
16:48 Outline pt 4
16:54 Related Work
17:31 Outline pt 5
17:35 Concluding Remarks
18:07 Thank You!
