The Use of Randomization and Statistical Significance in Data Mining

Published on Jan 16, 2013 (3158 views)

The concept and theory of statistical significance testing are well established in the traditional setting, but not in the problem settings related to data mining. In this talk I discuss the formulation as

Chapter list

The Use of Randomization and Statistical Significance in Data Mining 00:00
Contents 00:01
Helsinki DM group 2012 00:36
Recent alumni 01:49
Other approaches to the learning problem 02:44
“Clarifying” analogue 02:51
Learning problem 03:52
Bayesian learning 05:41
Collaborative filtering 07:20
Algorithmic approach (1) 07:50
Algorithmic approach (2) 08:27
Other approaches 09:07
Traditional statistical significance testing (1) 09:20
Traditional statistical significance testing (2) 09:50
Statistical significance testing 11:55
Multiple hypothesis testing (1) 12:33
Multiple hypothesis testing (2) 13:08
Data mining formulation (1) 14:22
Data mining formulation (2) 14:44
Data mining formulation (3) 15:48
Correlations and co-occurrences (1) 16:49
Correlations and co-occurrences (2) 17:00
Correlations and co-occurrences (3) 17:11
Correlations and co-occurrences (4) 18:15
Correlations and co-occurrences (5) 19:04
Correlations and co-occurrences (6) 19:35
Correlations and co-occurrences (7) 20:48
Correlations and co-occurrences (8) 22:17
Tell me what I do not know 23:45
Tell me something I don’t know (1) 23:55
Tell me something I don’t know (2) 24:28
Randomize = sample from Pr(ω) 25:33
Constrain RG 25:44
Randomize with constraint RG 26:11
Tell me something I don’t know (3) 26:24
Tell me something I don’t know (4) 27:12
Tell me something I don’t know (5) 28:16
Tell me something I don’t know (6) 29:47
Most informative set of patterns (1) 30:42
Most informative set of patterns (2) 30:53
Most informative set of patterns (3) 32:33
Most informative set of patterns (4) 32:59
Most informative set of patterns (5) 33:37
Most informative set of patterns (6) 34:09
Most informative set of patterns (7) 35:44
Most informative set of patterns (8) 36:02
Most informative set of patterns (9) 36:55
Most informative set of patterns (10) 39:02
Most informative set of patterns (11) 40:37
Time series segmentation (1) 41:54
Time series segmentation (2) 42:39
Time series segmentation (3) 43:47
Time series segmentation (4) 44:30
Time series segmentation (5) 44:45
Time series segmentation (6) 45:18
Time series segmentation (7) 47:00
Time series segmentation (8) 47:21
Agglomerative hierarchical clustering 47:48
Most informative set of patterns (12) 48:43
ptdm2012_puolamaeki_statistical_significance_01_Page_60 49:39
Relation to Bayesian learning (1) 49:49
Bayesian learning (1) 50:20
Bayesian learning (2) 50:55
Bayesian learning (3) 51:26
Bayesian learning (4) 51:49
Bayesian learning (5) 52:03
Bayesian learning (6) 52:20
Bayesian learning (7) 52:37
Bayesian learning (8) 53:17
Summary 54:35