Rare Category Detection for Spatial Data
Description
Given an unlabeled, imbalanced data set, the goal of rare category detection is to discover examples from the minority classes using only a few label requests. Rare category detection is an open challenge in machine learning with many applications, such as financial fraud detection, network intrusion detection, astronomy, and spam image detection. In this talk, I will introduce two methods for rare category detection with spatial data. The first essentially performs local density differential sampling and requires prior information about the data set as input. The second is based on specially designed exponential families and is prior-free. Experimental results demonstrate the effectiveness of these methods on several real data sets.
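To give a feel for what "local density differential sampling" can mean, here is a minimal sketch, not the algorithm presented in the talk. It assumes the idea is to count neighbors inside a small radius and to prefer points whose count drops most sharply toward a nearby point. The function name `density_differential_scores`, the parameters `k` and `scale`, and the toy data are all hypothetical; `k` stands in for the prior information (roughly, how many points a rare class is expected to contain).

```python
import numpy as np

def density_differential_scores(X, k, scale=1.0):
    """Score points by the largest drop in local density toward a close neighbor.

    Illustrative assumption only: k plays the role of prior knowledge
    about the expected size of a rare class.
    """
    # Pairwise Euclidean distances (quadratic memory, fine for small n).
    diffs = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diffs ** 2).sum(axis=-1))
    # Distance from each point to its k-th nearest neighbor
    # (index 0 of each sorted row is the point itself).
    kth = np.sort(dist, axis=1)[:, k]
    # The radius is set by the most compact region in the data.
    radius = scale * kth.min()
    # Local density proxy: number of points within the radius (self included).
    within = dist <= radius
    counts = within.sum(axis=1)
    # Score: biggest decrease in count from a point to any point within the radius.
    drops = counts[:, None] - counts[None, :]
    scores = np.where(within, drops, -np.inf).max(axis=1)
    return scores, counts

# Toy data: a coarse 20 x 20 grid (majority) plus a tight 5 x 5 cluster (minority).
g = np.arange(20.0)
majority = np.array([(x, y) for x in g for y in g])                   # indices 0..399
o = np.linspace(-0.02, 0.02, 5)
minority = np.array([(9.5 + dx, 9.5 + dy) for dx in o for dy in o])   # indices 400..424
X = np.vstack([majority, minority])

scores, counts = density_differential_scores(X, k=8, scale=1.1)
best = int(np.argmax(scores))  # index of the point to send to the labeling oracle first
print(best >= 400)             # True when the first query lands in the compact cluster
```

In this toy setup the highest-scoring point lies inside the compact cluster, near its boundary, where the neighbor count changes most abruptly. A full active-learning loop would query the label of that point and, if it belongs to the majority, adjust the radius and try again; that loop is omitted here.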
| Time | Slide |
| 0:00 | Rare Category Detection |
| 0:35 | What’s Rare Category Detection |
| 1:45 | Comparison with Outlier Detection |
| 2:55 | Comparison with Active Learning |
| 3:37 | Applications |
| 4:31 | The Big Picture |
| 5:40 | Outline |
| 6:04 | Related Work |
| 7:24 | Outline |
| 7:31 | Notations |
| 8:24 | Assumptions |
| 9:08 | Overview of the Algorithms |
| 9:30 | Two Classes: NNDB |
| 11:08 | NNDB: Calculate Class-Specific Radius |
| 12:00 | NNDB: Calculate Nearest Neighbors |
| 12:23 | NNDB: Calculate the Scores |
| 13:31 | NNDB: Pick the Next Candidate |
| 14:03 | Why NNDB Works |
| 14:58 | Multiple Classes: ALICE |
| 15:52 | Why ALICE Works |
| 16:15 | Implementation Issues |
| 17:04 | Results on Synthetic Data Sets |
| 18:24 | Summary of Real Data Sets |
| 19:18 | Results on Real Data Sets |
| 21:01 | Imprecise priors |
| 22:07 | Outline |
| 22:14 | Overview of the Algorithm |
| 22:47 | Specially Designed Exponential Families |
| 23:40 | SEDER Algorithm |
| 25:04 | Parameter Estimation |
| 25:57 | Parameter Estimation cont. |
| 26:43 | Scoring Function |
| 27:45 | Results on Synthetic Data Sets |
| 28:53 | Summary of Real Data Sets |
| 29:59 | Moderately Skewed Data Sets |
| 30:23 | Extremely Skewed Data Sets |
| 31:31 | Conclusion |
| 32:38 | Thank You! |
| 32:47 | Questions |
| 37:34 | Thank You! |