NATO Advanced Study Institute on Mining Massive Data Sets for Security

Fitting mixtures of regression lines with the forward search: application to clustering and outlier detection

author: Domenico Perrotta, Joint Research Centre
author: Andrea Cerioli, Università degli Studi di Parma

Description

The forward search is a method for detecting unidentified subsets and masked outliers and for determining their effect on models fitted to the data. This talk describes a semi-automatic approach to outlier detection and clustering through the forward search. We address challenging issues including selection of the number of groups. The performance of the algorithm is shown on several trade data sets relevant for fraud detection problems.

Slides
0:00 Fitting mixtures of regression lines
1:15 Contents
1:48 Multivariate Normality
3:28 Obscured Structure
4:34 An example of masking with clustered data (1)
5:34 An example of masking with clustered data (2)
8:34 An example of masking with clustered data (3)
9:25 Introductory example: lessons
10:17 An example of masking with clustered data (2)
11:29 Introductory example: lessons
11:33 The Forward Search: a few references
12:23 The Forward Search: Structure
16:53 The Forward Search: Distances for Multivariate Outlier Detection
19:33 The Forward Search: One Population
20:36 The Forward Search: Scaled distances
21:35 Swiss Heads
22:43 Swiss Heads 2
23:35 Swiss Heads 3
23:46 The Forward Search: Outliers
25:33 Swiss Heads: Minimum Distance
26:35 The Forward Search: Simulation Envelopes (1)
27:55 The Forward Search: Simulation Envelopes (2)
28:20 The Forward Search: Several Populations
28:41 Swiss Banknote Data (1)
29:50 Swiss Banknote Data (2)
30:39 Swiss Banknote Data 2
31:33 Swiss Banknote Data 3
33:55 Swiss Banknote Data 4
34:28 Forged Swiss Banknotes – Envelopes 1
34:32 Forged Swiss Banknotes – Envelopes 2
36:28 Forged Swiss Banknotes – Envelopes 3
37:05 The Forward Search: Unknown Populations
37:39 Swiss Banknote Data Revisited
37:44 Swiss Banknote Data Revisited 2
39:22 Swiss Banknote Data Revisited 3
39:24 Interrogating the Plots
40:07 Comparison with Standard Cluster Analysis
42:54 Confirmatory steps
42:56 The Forward Search: Distribution
44:00 The Forward Search: Distribution 2
44:44 The Forward Search: Distribution 3
45:40 The Forward Search: Procedure
45:45 The Forward Search: Procedure 2
45:48 The Forward Search: Procedure 3
46:08 The Forward Search: Evidence
48:49 The Forward Search: Evidence 2
49:19 Forward Search: Summary


Videos

Part 1 0:49:42
Part 2 0:15:02
