Solomonovi seminarji (Solomon Seminars)

Which Supervised Learning Method Works Best for What? An Empirical Comparison of Learning Methods and Metrics

author: Rich Caruana, Cornell University

Description

Decision trees are intelligible, but do they perform well enough that you should use them? Have SVMs replaced neural nets, or are neural nets still best for regression, and SVMs best for classification? Boosting, like SVMs, maximizes margins, but can boosting compete with SVMs? And if it does compete, is it better to boost weak models, as theory might suggest, or to boost stronger models? Bagging is simpler than boosting -- how well does bagging stack up against boosting? Breiman said Random Forests are better than bagging and as good as boosting. Was he right? And what about old friends like logistic regression, KNN, and naive Bayes? Should they be relegated to the history books, or do they still fill important niches?
In this talk we compare the performance of ten supervised learning methods on nine criteria: Accuracy, F-score, Lift, Precision/Recall Break-Even Point, Area under the ROC Curve, Average Precision, Squared Error, Cross-Entropy, and Probability Calibration. The results show that no one learning method does it all, but some methods can be "repaired" so that they do very well across all performance metrics. In particular, we show how to obtain the best probabilities from max-margin methods such as SVMs and boosting via Platt's Method and isotonic regression. We then describe a new ensemble method that combines selected models from these ten learning methods to yield much better performance. Although these ensembles perform extremely well, they are too complex for many applications. We'll describe what we're doing to try to fix that. Finally, if time permits, we'll discuss how the nine performance metrics relate to each other, and which of them you probably should (or shouldn't) use.
During this talk I'll briefly describe the learning methods and performance metrics to help make the lecture accessible to non-specialists in machine learning.
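
The two calibration methods named in the description, Platt's Method and isotonic regression, can be illustrated with a minimal sketch. This is not the speaker's code. It assumes plain gradient descent on log loss for the sigmoid fit (Platt's original procedure also smooths the targets, which is omitted here) and the Pool Adjacent Violators algorithm for the isotonic fit. The function names, learning rate, and step count are hypothetical choices made for illustration.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_platt(scores, labels, lr=0.1, steps=5000):
    """Fit p = sigmoid(a * score + b) by gradient descent on mean log loss.

    Simplified sketch: no target smoothing, fixed learning rate (assumptions).
    """
    a, b = 1.0, 0.0
    n = len(scores)
    for _ in range(steps):
        grad_a = grad_b = 0.0
        for s, y in zip(scores, labels):
            err = sigmoid(a * s + b) - y  # derivative of log loss w.r.t. the logit
            grad_a += err * s
            grad_b += err
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b

def fit_isotonic(scores, labels):
    """Pool Adjacent Violators: non-decreasing fit of labels against sorted scores.

    Returns (sorted scores, fitted probabilities in the same order).
    """
    pairs = sorted(zip(scores, labels))
    blocks = []  # each block is [sum of labels, number of points]
    for _, y in pairs:
        blocks.append([float(y), 1])
        # Merge backwards while the previous block's mean exceeds the last one's.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    fitted = []
    for total, count in blocks:
        fitted.extend([total / count] * count)
    return [s for s, _ in pairs], fitted
```

In practice both mappings would be fitted on held-out model outputs (for example SVM margins) rather than on the training set, since fitting the calibrator on the same data the model was trained on tends to give overconfident probabilities.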

Slides
1:31 An Empirical Comparison of Learning Methods++
3:22 Preliminaries: What is Supervised Learning?
3:46 Sad State of Affairs: Supervised Learning
4:21 Sad State of Affairs: Supervised Learning
4:31 Sad State of Affairs: Supervised Learning
4:33 A Real Decision Tree
4:34 Not ALL Decision Trees Are Intelligible
4:36 Sad State of Affairs: Supervised Learning
5:14 A Typical Neural Net
5:30 Linear Regression
5:56 Logistic Regression
6:13 Sad State of Affairs: Supervised Learning
6:25 Sad State of Affairs: Supervised Learning
9:22 Questions
11:25 Data Sets
14:01 Binary Classification Performance Metrics
18:05 Normalized Scores
19:44 Massive Empirical Comparison
20:47 Look at Predicting Probabilities First
21:47 Results on Test Sets (Normalized Scores)
26:50 Bagged Decision Trees
27:56 Bagging Results
29:50 Random Forests (Bagged Trees++)
33:29 Calibration & Reliability Diagrams
37:18 Back to SVMs: Results on Test Sets
37:39 SVM Reliability Plots
38:31 Platt Scaling by Fitting a Sigmoid
39:21 Results After Platt Scaling SVMs
40:22 Results After Platt Scaling SVMs
43:16 Results After Platt Scaling SVMs
43:34 Summary of Model Performances
44:25 Smart Model ≠ Good Probs
44:55 Ada Boosting
47:04 Why Boosting is Not Well Calibrated
49:31 Consistent With Interpretations of Boosting
50:48 Platt Scaling of Boosted Trees (7 problems)
52:02 Results After Platt Scaling All Models
53:50 Revenge of the Decision Tree!
56:52 Methods for Achieving Calibration
58:11 Boosting with Log-Loss
60:08 Isotonic Regression
60:12 Isotonic Regression
61:37 Platt Scaling vs. Isotonic Regression
62:17 Platt Scaling vs. Isotonic Regression
64:46 Summary: Before/After Calibration
65:52 Where Does That Leave Us?
66:32 Best of the Best of the Best
70:00 If we need to train all models and pick best, can we do better than picking best?
71:00 Normalized Scores of Ensembles
71:54 Basic Ensemble Selection Algorithm
72:03 Basic Ensemble Selection Algorithm
72:28 Basic Ensemble Selection Algorithm
73:06 Big Problem: Overfitting
73:27 Normalized Scores for ES
74:05 Ensemble Selection vs Best: 3 NLP Problems
74:12 Ensemble Selection Works, But Is It Worth It?
74:13 Computational Cost
74:29 Ensemble Selection
74:52 Best Ensembles are Big and Ugly!
75:22 Solution: Model Compression
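
The slide list names a "Basic Ensemble Selection Algorithm" without spelling it out on this page. A minimal sketch follows, assuming greedy forward selection with replacement: at each round, add whichever model most improves the chosen metric of the averaged predictions on a held-out set, and stop when nothing improves. The function names, the dictionary layout of the model library, and the stopping rule are hypothetical, and accuracy stands in for any of the nine metrics.

```python
def accuracy(probs, labels):
    # Fraction of examples where thresholding at 0.5 matches the 0/1 label.
    hits = sum((p >= 0.5) == bool(y) for p, y in zip(probs, labels))
    return hits / len(labels)

def ensemble_selection(library, labels, metric=accuracy, max_rounds=20):
    """Greedy forward selection of models, with replacement.

    library: dict mapping a model name to its predicted probabilities
             on a held-out selection set (hypothetical layout).
    Returns (list of chosen model names, best metric value reached).
    """
    chosen = []
    total = [0.0] * len(labels)  # running sum of the chosen models' predictions
    best_score = float("-inf")
    for _ in range(max_rounds):
        size = len(chosen) + 1
        scored = []
        for name, preds in library.items():
            # Score the ensemble as if this model were added once more.
            averaged = [(t + p) / size for t, p in zip(total, preds)]
            scored.append((metric(averaged, labels), name))
        score, name = max(scored)
        if score <= best_score:  # no candidate improves the ensemble: stop
            break
        best_score = score
        chosen.append(name)
        total = [t + p for t, p in zip(total, library[name])]
    return chosen, best_score
```

Because the selection set is reused at every round, a procedure like this can overfit that set, which is the concern raised by the "Big Problem: Overfitting" slide. Any score it reports should be checked on separate test data.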

Reviews and comments:

Comment 1: Dan, July 11, 2007 at 10:48 p.m.:

Very well-delivered talk and very useful, combined with the slides. The inclusion of background to some of the techniques is useful, helps contextualise the comparisons nicely.

(The "View slides" option makes it a bit difficult to read the text in the tables, by the way.)


Comment 2: KEBI_Weiwei, May 27, 2008 at 8:26 p.m.:

Nice talk!

The link of the slides is wrong, btw.

