Stability Selection for High-Dimensional Data

author: Peter Bühlmann, ETH Zurich
published: Dec. 18, 2008,   recorded: December 2008,   views: 1583

Description

Despite remarkable progress over the past 5 years, estimation of high-dimensional structure, such as in graphical modeling, cluster analysis or variable selection in (generalized) regression, remains difficult. Among the main problems are: (i) the choice of an appropriate amount of regularization; (ii) a potential lack of stability of a solution and quantification of evidence or significance of a selected structure or of a set of selected variables. We introduce the new method of stability selection which addresses these two major problems for high-dimensional structure estimation, both from a practical and theoretical point of view. Stability selection is based on subsampling in combination with (high-dimensional) selection algorithms. As such, the method is extremely general and has a very wide range of applicability. Stability selection provides finite sample control for some error rates of false discoveries and hence a transparent principle to choose a proper amount of regularization for structure estimation or model selection. Maybe even more importantly, results are typically remarkably insensitive to the chosen amount of regularization. Another property of stability selection is the empirical and theoretical improvement over pre-specified selection methods. We prove for randomized Lasso that stability selection will be model selection consistent even if the necessary conditions needed for consistency of the original Lasso method are violated. We demonstrate stability selection for variable selection, Gaussian graphical modeling and clustering, using real and simulated data. This is joint work with Nicolai Meinshausen.
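
The core idea described above (run a selection algorithm on many random subsamples and keep the variables that are selected most of the time) can be illustrated with a minimal sketch. This is not the implementation from the talk. The base selector here is a simple stand-in that ranks variables by marginal correlation instead of the Lasso, and the names `top_q_by_correlation`, `stability_selection` and `make_toy_data`, the subsample size of n/2, and the threshold of 0.6 are all assumptions made for illustration.

```python
import random


def top_q_by_correlation(X, y, q):
    """Hypothetical base selector (a stand-in, not the Lasso): return the
    indices of the q columns of X most correlated with y in absolute value."""
    n, p = len(y), len(X[0])
    y_mean = sum(y) / n
    scores = []
    for j in range(p):
        col = [row[j] for row in X]
        c_mean = sum(col) / n
        cov = sum((c - c_mean) * (v - y_mean) for c, v in zip(col, y))
        var = sum((c - c_mean) ** 2 for c in col)
        # Proportional to |correlation|: the spread of y is the same for all j.
        scores.append(abs(cov) / var ** 0.5 if var > 0 else 0.0)
    ranked = sorted(range(p), key=lambda j: scores[j], reverse=True)
    return set(ranked[:q])


def stability_selection(X, y, selector, n_subsamples=100, threshold=0.6, seed=0):
    """Run `selector` on random half-samples and keep the variables whose
    selection frequency reaches `threshold` (both choices are assumptions)."""
    rng = random.Random(seed)
    n, p = len(y), len(X[0])
    counts = [0] * p
    for _ in range(n_subsamples):
        idx = rng.sample(range(n), n // 2)  # subsample without replacement
        Xs = [X[i] for i in idx]
        ys = [y[i] for i in idx]
        for j in selector(Xs, ys):
            counts[j] += 1
    freqs = [c / n_subsamples for c in counts]
    stable = [j for j, f in enumerate(freqs) if f >= threshold]
    return stable, freqs


def make_toy_data(n, p, seed):
    """Simulated data for the demo: only variables 0 and 1 influence y."""
    rng = random.Random(seed)
    X = [[rng.gauss(0, 1) for _ in range(p)] for _ in range(n)]
    y = [3 * row[0] - 3 * row[1] + rng.gauss(0, 1) for row in X]
    return X, y


if __name__ == "__main__":
    X, y = make_toy_data(n=200, p=30, seed=1)
    selector = lambda A, b: top_q_by_correlation(A, b, q=5)
    stable, freqs = stability_selection(X, y, selector, n_subsamples=50)
    print("stable set:", stable)
    print("selection frequencies:", [round(f, 2) for f in freqs])
```

Any selection procedure that returns a set of variable indices can be passed as `selector`, which reflects the generality claimed in the abstract. The selection frequencies, rather than a single fitted model, are what the threshold is applied to.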

Slides: sip08_buhlmann_ssfhd_01.pdf (1.8 MB)

