Carnegie Mellon Machine Learning Lunch seminar

Rare Category Detection for Spatial Data

author: Jingrui He, Machine Learning Department, School of Computer Science, Carnegie Mellon University

Description

Given an unlabeled, imbalanced data set, the goal of rare category detection is to discover examples from the minority classes using only a few label requests. Rare category detection is an open challenge in machine learning, with applications in financial fraud detection, network intrusion detection, astronomy, and spam image detection, among others. In this talk, I will introduce two methods for rare category detection with spatial data. The first essentially performs local density differential sampling and requires prior information about the data set as input. The second is based on specially designed exponential families and is prior-free. Experimental results demonstrate the effectiveness of both methods on several real data sets.
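To make "local density differential sampling" concrete, here is a minimal Python sketch of one plausible reading of the idea: estimate a local density for every point, then favor points whose density is much higher than that of a nearby point. It is an illustration under stated assumptions, not the speaker's implementation. The function name `density_differential_scores`, the scale parameter `t`, the use of Euclidean distance, and the toy data are all hypothetical; the only detail taken from the abstract is that a prior (here, the minority class proportion) is supplied as input.

```python
import numpy as np


def density_differential_scores(X, prior, t=1.0):
    """Score each point by the largest drop in local density to a nearby point.

    Assumption: `prior` is the minority-class proportion and `t` scales the
    neighborhood radius. Both names are hypothetical, chosen for this sketch.
    """
    n = len(X)
    # Expected number of minority examples, clipped to a valid neighbor rank.
    k = min(max(1, int(round(n * prior))), n)
    # Pairwise Euclidean distances (O(n^2) memory, fine for small n).
    dists = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    # Distance to the k-th nearest neighbor, counting the point itself as the first.
    kth = np.sort(dists, axis=1)[:, k - 1]
    # The smallest such distance sets the neighborhood scale.
    radius = t * kth.min()
    within = dists <= radius            # neighborhood membership (includes self)
    counts = within.sum(axis=1)         # local density estimate per point
    # score_i = max over neighbors j of (count_i - count_j)
    diff = counts[:, None] - counts[None, :]
    return np.where(within, diff, -np.inf).max(axis=1)


# Toy data: a 10 x 10 unit grid (majority) plus a tight 5-point cluster (minority).
grid = np.array([[i, j] for i in range(10) for j in range(10)], dtype=float)
cluster = np.array([[4.5, 4.5], [4.55, 4.5], [4.45, 4.5], [4.5, 4.55], [4.5, 4.45]])
X = np.vstack([grid, cluster])

scores = density_differential_scores(X, prior=len(cluster) / len(X))
best = int(np.argmax(scores))  # index of the point to send to the labeler first
```

On this toy data the highest-scoring point lies inside the dense cluster, which is the behavior the abstract describes. A full detector would wrap the scoring in a query loop that asks for a label and adjusts the neighborhood when the answer is not a minority example; the abstract does not give those details, so the loop is omitted here.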

Slides
0:00 Rare Category Detection
0:35 What’s Rare Category Detection
1:45 Comparison with Outlier Detection
2:55 Comparison with Active Learning
3:37 Applications
4:31 The Big Picture
5:40 Outline
6:04 Related Work
7:24 Outline
7:31 Notations
8:24 Assumptions
9:08 Overview of the Algorithms
9:30 Two Classes: NNDB
11:08 NNDB: Calculate Class-Specific Radius
12:00 NNDB: Calculate Nearest Neighbors
12:23 NNDB: Calculate the Scores
13:31 NNDB: Pick the Next Candidate
14:03 Why NNDB Works
14:58 Multiple Classes: ALICE
15:52 Why ALICE Works
16:15 Implementation Issues
17:04 Results on Synthetic Data Sets
18:24 Summary of Real Data Sets
19:18 Results on Real Data Sets
21:01 Imprecise priors
22:07 Outline
22:14 Overview of the Algorithm
22:47 Specially Designed Exponential Families
23:40 SEDER Algorithm
25:04 Parameter Estimation
25:57 Parameter Estimation cont.
26:43 Scoring Function
27:45 Results on Synthetic Data Sets
28:53 Summary of Real Data Sets
29:59 Moderately Skewed Data Sets
30:23 Extremely Skewed Data Sets
31:31 Conclusion
32:38 Thank You!
32:47 - Questions
32:55 - Questions
33:48 - Questions
34:35 - Questions
35:12 - Questions
35:17 - Questions
35:32 - Questions
37:34 Thank You!
