From Trees to Forests and Rule Sets - A Unified Overview of Ensemble Methods

author: Giovanni Seni, Santa Clara University
author: John Elder, Elder Research, Inc.
published: Aug. 14, 2007, recorded: August 2007, views: 28452


Part 1 (59:31)
Part 2 (1:45:08)


Ensemble methods are one of the most influential developments in Machine Learning over the past decade. They perform extremely well in a variety of problem domains, have desirable statistical properties, and scale well computationally. By combining competing models into a committee, they can strengthen “weak” learning procedures.
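
To make the committee idea concrete, here is a minimal sketch using AdaBoost (one of the methods listed in the outline) with one-split decision stumps as the weak learner. The toy data set, the function names, and the choice of three rounds are assumptions made for illustration; they do not come from the lecture.

```python
import math

def stump_predict(stump, x):
    """A stump is (threshold, sign): predict sign if x > threshold, else -sign."""
    threshold, sign = stump
    return sign if x > threshold else -sign

def best_stump(xs, ys, weights):
    """Return the stump with the lowest weighted training error."""
    values = sorted(set(xs))
    thresholds = ([values[0] - 0.5]
                  + [(a + b) / 2 for a, b in zip(values, values[1:])]
                  + [values[-1] + 0.5])
    best, best_err = None, float("inf")
    for t in thresholds:
        for sign in (+1, -1):
            err = sum(w for x, y, w in zip(xs, ys, weights)
                      if stump_predict((t, sign), x) != y)
            if err < best_err:
                best, best_err = (t, sign), err
    return best, best_err

def adaboost(xs, ys, n_rounds):
    """Build a weighted committee of stumps, reweighting the data each round."""
    n = len(xs)
    weights = [1.0 / n] * n
    committee = []
    for _ in range(n_rounds):
        stump, err = best_stump(xs, ys, weights)
        err = min(max(err, 1e-12), 1 - 1e-12)  # guard against log(0)
        alpha = 0.5 * math.log((1 - err) / err)
        committee.append((alpha, stump))
        # Misclassified points gain weight, correctly classified points lose it.
        weights = [w * math.exp(-alpha * y * stump_predict(stump, x))
                   for x, y, w in zip(xs, ys, weights)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return committee

def committee_predict(committee, x):
    """Weighted vote of all stumps in the committee."""
    score = sum(alpha * stump_predict(stump, x) for alpha, stump in committee)
    return 1 if score > 0 else -1

def accuracy(predict, xs, ys):
    return sum(predict(x) == y for x, y in zip(xs, ys)) / len(xs)

# Toy problem (assumed): the positive class is an interval, which no
# single threshold can separate.
xs = list(range(10))
ys = [1 if 3 <= x <= 6 else -1 for x in xs]

single, _ = best_stump(xs, ys, [1.0 / len(xs)] * len(xs))
committee = adaboost(xs, ys, n_rounds=3)

single_acc = accuracy(lambda x: stump_predict(single, x), xs, ys)
committee_acc = accuracy(lambda x: committee_predict(committee, x), xs, ys)
print("best single stump:", single_acc)
print("3-stump committee:", committee_acc)
```

On this toy data the best single stump gets 7 of the 10 points right, while the three-stump committee classifies all 10 correctly: each later stump concentrates on the points the earlier ones got wrong.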

This tutorial explains two recent developments with ensemble methods:
Importance Sampling reveals “classic” ensemble methods (bagging, random forests, and boosting) to be special cases of a single algorithm. This unified view clarifies the properties of these methods and suggests ways to improve their accuracy and speed.
Rule Ensembles are linear rule models derived from decision tree ensembles. While maintaining (and often improving) the accuracy of the tree ensemble, the rule-based model is much more interpretable.
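
As a rough illustration of how rules can be derived from trees, the sketch below walks two small hand-written trees, turns every non-root node into a rule (the conjunction of split conditions on the path to that node), and evaluates each rule as a 0/1 feature. The dictionary tree format, the feature names `age` and `income`, and the helper names are hypothetical. The step that makes this a rule ensemble, a regularized linear fit over these rule features, is not shown.

```python
def extract_rules(node, path=()):
    """Return one rule (a tuple of conditions) for every non-root node.

    A leaf is represented by None; an internal node is a dict with
    'feature', 'threshold', 'left' (<=) and 'right' (>) entries.
    """
    rules = [path] if path else []
    if node is not None:  # internal node: recurse into both children
        f, t = node["feature"], node["threshold"]
        rules += extract_rules(node["left"], path + ((f, "<=", t),))
        rules += extract_rules(node["right"], path + ((f, ">", t),))
    return rules

def rule_fires(rule, row):
    """1 if the row satisfies every condition in the rule, else 0."""
    return int(all(row[f] <= t if op == "<=" else row[f] > t
                   for f, op, t in rule))

# Two hypothetical trees standing in for a fitted tree ensemble.
trees = [
    {"feature": "age", "threshold": 30,
     "left": None,
     "right": {"feature": "income", "threshold": 50,
               "left": None, "right": None}},
    {"feature": "income", "threshold": 60, "left": None, "right": None},
]

rules = [rule for tree in trees for rule in extract_rules(tree)]

rows = [{"age": 25, "income": 40}, {"age": 45, "income": 80}]
features = [[rule_fires(rule, row) for rule in rules] for row in rows]

for rule in rules:
    print(" AND ".join(f"{f} {op} {t}" for f, op, t in rule))
print(features)
```

Each column of `features` corresponds to one readable rule such as `age > 30 AND income > 50`, which is what makes a linear model over these columns easier to interpret than the original trees.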

This tutorial is aimed at both novice and advanced data mining researchers and practitioners, especially in Engineering, Statistics, and Computer Science. Users with little exposure to ensemble methods will gain a clear overview of each method. Advanced practitioners already employing ensembles will gain insight into the importance sampling view and rule ensembles as ways to build more accurate, faster, and more interpretable models.

John Elder's lecture:
In a Nutshell, Examples & Timeline
Predictive Learning
Decision Trees
Giovanni Seni's lecture:
Model Selection (Bias-Variance Tradeoff, Regularization via Shrinkage)
Ensemble Learning & Importance Sampling (ISLE)
Generic Ensemble Generation
Bagging, Random Forest, AdaBoost, MART
Rule Ensembles

Slides: kdd07_elder_seni_fttf.pdf (1.9 MB)
