Feature Selection - From Correlation to Causality
published: Dec. 20, 2008, recorded: December 2008, views: 870
Variable and feature selection have become the focus of much research in areas of application for which datasets with tens or hundreds of thousands of variables are available. These areas include text processing of internet documents, gene expression array analysis, and combinatorial chemistry. The objective of variable selection is three-fold: improving the prediction performance of the predictors, providing faster and more cost-effective predictors, and providing a better understanding of the underlying process that generated the data. This tutorial will cover a wide range of aspects of such problems: providing a better definition of the objective function, feature construction, feature ranking, multivariate feature selection, efficient search methods, and feature validity assessment methods. Most feature selection methods do not attempt to uncover causal relationships between feature and target and focus instead on making best predictions. We will examine situations in which the knowledge of causal relationships benefits feature selection. Such benefits may include: explaining relevance in terms of causal mechanisms, distinguishing between actual features and experimental artifacts, predicting the consequences of actions performed by external agents, and making predictions in non-stationary environments.
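As a rough illustration of the gap between correlation-based ranking and causal relevance, the sketch below ranks features by absolute Pearson correlation with the target on synthetic data. It is not taken from the lecture: the data-generating process, the feature names (`cause`, `effect`, `noise`), and the helper functions are all assumptions made for this example. In the toy data, `cause` drives the target, `effect` is a noisy consequence of the target, and `noise` is unrelated.

```python
import random
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear information
    return cov / sqrt(var_x * var_y)


def rank_features(columns, target):
    """Univariate ranking: sort features by |correlation| with the target."""
    scores = {name: abs(pearson(col, target)) for name, col in columns.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


def make_toy_data(n=2000, seed=0):
    """Hypothetical causal chain: cause -> target -> effect, plus pure noise."""
    rng = random.Random(seed)
    cause = [rng.gauss(0, 1) for _ in range(n)]
    target = [c + rng.gauss(0, 0.5) for c in cause]
    effect = [t + rng.gauss(0, 0.1) for t in target]
    noise = [rng.gauss(0, 1) for _ in range(n)]
    return {"cause": cause, "effect": effect, "noise": noise}, target


if __name__ == "__main__":
    columns, target = make_toy_data()
    for name, score in rank_features(columns, target):
        print(f"{name}: {score:.3f}")
```

Under these assumptions the ranking places `effect` above `cause`, because the consequence tracks the target more tightly than the cause does. That is fine for pure prediction, but a ranking of this kind says nothing about what would happen if an external agent set `effect` by hand, which is the sort of question causal knowledge is meant to address.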