Solomonovi seminarji (Solomon Seminars)

Spooky Stuff in Metric Space

author: Rich Caruana, Cornell University

Description

Decision trees are intelligible, but do they perform well enough that you should use them? Have SVMs replaced neural nets, or are neural nets still best for regression and SVMs best for classification? Boosting maximizes margins much as SVMs do, but can boosting compete with SVMs? And if it does compete, is it better to boost weak models, as theory might suggest, or to boost stronger models? Bagging is simpler than boosting; how well does it stack up against boosting? Breiman said Random Forests are better than bagging and as good as boosting. Was he right? And what about old friends like logistic regression, KNN, and naive Bayes? Should they be relegated to the history books, or do they still fill important niches?
In this talk we compare the performance of ten supervised learning methods on nine criteria: Accuracy, F-score, Lift, Precision/Recall Break-Even Point, Area under the ROC Curve, Average Precision, Squared Error, Cross-Entropy, and Probability Calibration. The results show that no one learning method does it all, but some methods can be "repaired" so that they do very well across all performance metrics. In particular, we show how to obtain the best probabilities from max-margin methods such as SVMs and boosting via Platt's method and isotonic regression. We then describe a new ensemble method that combines selected models from these ten learning methods to yield much better performance. Although these ensembles perform extremely well, they are too complex for many applications. We'll describe what we're doing to try to fix that. Finally, if time permits, we'll discuss how the nine performance metrics relate to each other, and which of them you probably should (or shouldn't) use.
During this talk I'll briefly describe the learning methods and performance metrics to help make the lecture accessible to non-specialists in machine learning.
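
The two calibration techniques named in the description, Platt's method and isotonic regression, can be illustrated with a small sketch. This is not the code used in the talk. It is a minimal pure-Python illustration under stated assumptions: the function names (fit_platt, platt_predict, fit_isotonic) and the toy data are hypothetical, the sigmoid is fit with plain gradient descent rather than the optimizer from Platt's paper, and tied scores are not pooled. In practice both mappings would be fit on held-out data rather than on the training set.

```python
import math

def fit_platt(scores, labels, lr=0.1, steps=2000):
    """Fit p = sigmoid(a*s + b) to 0/1 labels by gradient descent on log loss.

    Simplified stand-in for Platt's method (assumption: no target smoothing,
    no second-order optimizer).
    """
    a, b = 0.0, 0.0
    n = len(scores)
    for _ in range(steps):
        grad_a = grad_b = 0.0
        for s, y in zip(scores, labels):
            p = 1.0 / (1.0 + math.exp(-(a * s + b)))
            grad_a += (p - y) * s
            grad_b += (p - y)
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b

def platt_predict(a, b, s):
    """Map a raw score s to a calibrated probability."""
    return 1.0 / (1.0 + math.exp(-(a * s + b)))

def fit_isotonic(scores, labels):
    """Isotonic regression via pool-adjacent-violators (PAV).

    Returns the scores in ascending order and a non-decreasing list of
    fitted probabilities, one per score.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    blocks = []  # each block is [sum_of_labels, count]
    for i in order:
        blocks.append([float(labels[i]), 1])
        # Merge backwards while the previous block's mean exceeds this one's.
        while (len(blocks) > 1 and
               blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            s, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c
    fitted = []
    for s, c in blocks:
        fitted.extend([s / c] * c)
    return [scores[i] for i in order], fitted

# Hypothetical toy data: raw margins from some classifier and true labels.
toy_scores = [-2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0, 2.5]
toy_labels = [0, 0, 1, 0, 0, 1, 0, 1, 1, 1]

a, b = fit_platt(toy_scores, toy_labels)
xs, iso = fit_isotonic(toy_scores, toy_labels)
```

Platt's method assumes the score-to-probability mapping is sigmoid-shaped, while isotonic regression only assumes it is monotone, so the latter is more flexible but generally needs more data to avoid overfitting.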

Slides
0:00 Spooky Stuff: Data Mining in Metric Space
0:51 Motivation #1
1:06 Motivation #1: Pneumonia Risk Prediction
16:17 Motivation #1: Many Learning Algorithms
16:48 Motivation #2: SLAC B/Bbar
21:56 Motivation #2: Improves Calibration Order of Magnitude
22:38 Motivation #2: Significantly Improves SLQ
23:38 Motivation #2
23:46 Motivation #3
23:47 Motivation #3
24:08 Scary Stuff
26:18 Scary Stuff
27:37 Scary Stuff
35:49 In this work we compare nine commonly used performance metrics by applying data mining to the results of a massive empirical study
36:42 10 Binary Classification Performance Metrics
37:31 lift = 3.5 if mailings sent to 20% of the customers
38:05 better performance
38:58 Predicted 1 Predicted 0
39:21 ROC Plot and ROC Area
40:21 diagonal line is random prediction
40:46 Calibration Plot
41:03 Base-Level Learning Methods
41:08 Data Sets
41:12 Massive Empirical Comparison
41:40 COVTYPE: Calibration vs. Accuracy
44:46 Multi Dimensional Scaling
45:32 Scaling, Ranking, and Normalizing
46:12 Multi Dimensional Scaling
47:17 Multi Dimensional Scaling
47:42 2-D Multi-Dimensional Scaling
51:18 2-D Multi-Dimensional Scaling
52:54 Adult Covertype Hyper-Spectral
53:21 Correlation Analysis
53:31 Rank Correlations
57:05 Summary
58:26 New Resources
58:39 Future/Related Work
58:46 Thank You.
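
Two of the metrics that appear in the slide outline above, lift and ROC area, can be made concrete with a short sketch. This is an illustrative assumption rather than material from the slides: the function names lift_at and roc_area and the toy data are hypothetical. Lift is computed as the positive rate among the top-scored fraction of cases divided by the overall positive rate, and ROC area as the fraction of (positive, negative) pairs that the scores rank correctly, with ties counted as one half.

```python
def lift_at(scores, labels, fraction=0.2):
    """Positive rate in the top `fraction` by score, over the base rate."""
    n = len(scores)
    k = max(1, int(round(fraction * n)))
    ranked = sorted(zip(scores, labels), key=lambda t: t[0], reverse=True)
    top_rate = sum(y for _, y in ranked[:k]) / k
    base_rate = sum(labels) / n
    return top_rate / base_rate

def roc_area(scores, labels):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 1/2."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical toy data: model scores and true 0/1 labels.
toy_scores = [-2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0, 2.5]
toy_labels = [0, 0, 1, 0, 0, 1, 0, 1, 1, 1]

print(lift_at(toy_scores, toy_labels, 0.2))  # top 20% are all positives: 1.0 / 0.5
print(roc_area(toy_scores, toy_labels))      # 21 of 25 pairs ranked correctly
```

A model that gives every case the same score gets a ROC area of 0.5 under this definition, which corresponds to the diagonal "random prediction" line in a ROC plot.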
