Feature selection, fundamentals and applications
Description
Variable and feature selection have become the focus of much research
in areas of application for which datasets with tens or hundreds of thousands
of variables are available. These areas include text processing of internet
documents, gene expression array analysis, and combinatorial chemistry.
The objective of variable selection is three-fold: improving the prediction performance
of the predictors, providing faster and more cost-effective predictors,
and providing a better understanding of the underlying process that generated the
data.
This presentation will cover a wide range of aspects of such problems: providing
a better definition of the objective function, feature construction, feature ranking,
multivariate feature selection, efficient search methods, and feature validity
assessment methods.
Most feature selection methods do not attempt to uncover causal relationships
between features and target and focus instead on making the best predictions. We will
examine situations in which the knowledge of causal relationships benefits feature
selection. Such benefits may include: explaining relevance in terms of causal
mechanisms, distinguishing between actual features and experimental artifacts,
predicting the consequences of actions performed by external agents, and making
predictions in non-stationary environments.
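To make the idea of feature ranking mentioned in the description more concrete, here is a minimal sketch of a univariate filter that scores each feature by a signal-to-noise ratio. It assumes the common definition (absolute difference of the class means divided by the sum of the class standard deviations); the exact formula used in the lecture is not given on this page. The function names `s2n_scores` and `rank_features` and the toy data are hypothetical, chosen only for illustration.

```python
from statistics import mean, stdev


def s2n_scores(X, y):
    """Signal-to-noise score per feature for a two-class problem.

    X: list of samples, each a list of feature values.
    y: list of labels, +1 or -1 (at least two samples per class).
    Assumed definition: |mean_pos - mean_neg| / (std_pos + std_neg).
    """
    pos = [row for row, label in zip(X, y) if label == 1]
    neg = [row for row, label in zip(X, y) if label == -1]
    scores = []
    for j in range(len(X[0])):
        pos_j = [row[j] for row in pos]
        neg_j = [row[j] for row in neg]
        gap = abs(mean(pos_j) - mean(neg_j))
        spread = stdev(pos_j) + stdev(neg_j)
        if spread > 0:
            scores.append(gap / spread)
        else:
            # Zero spread: perfectly separated if the means differ, else useless.
            scores.append(float("inf") if gap > 0 else 0.0)
    return scores


def rank_features(X, y):
    """Return feature indices sorted from most to least discriminative."""
    scores = s2n_scores(X, y)
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)


# Toy data (made up): feature 0 separates the classes well, feature 1 does not.
X = [
    [5.0, 1.0, 0.2],
    [6.0, 3.0, 0.1],
    [1.0, 2.0, 0.3],
    [2.0, 2.5, 0.2],
]
y = [1, 1, -1, -1]
ranking = rank_features(X, y)
```

Because each feature is scored on its own, a filter like this is cheap to compute but cannot detect features that are only useful in combination, which is the motivation for the multivariate methods listed in the outline.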
| Time | Slide |
| 0:00 | Feature selection and causal discovery fundamentals and applications |
| 0:17 | Feature Selection |
| 1:04 | Leukemia Diagnosis |
| 3:18 | Prostate Cancer Genes |
| 4:19 | RFE SVM for cancer diagnosis |
| 6:13 | QSAR: Drug Screening |
| 7:15 | Text Filtering |
| 8:53 | Face Recognition |
| 10:16 | Nomenclature |
| 11:17 | Univariate Filter Methods |
| 11:26 | Individual Feature Irrelevance |
| 13:21 | S2N |
| 15:46 | Univariate Dependence |
| 18:28 | Other criteria (chap. 3) |
| 19:31 | T-test |
| 20:48 | Statistical tests (chap. 2) |
| 27:13 | Multivariate Methods |
| 28:04 | Univariate selection may fail |
| 30:55 | Filters, Wrappers, and Embedded methods |
| 32:09 | Relief |
| 34:59 | Wrappers for feature selection |
| 36:12 | Search Strategies (chap. 4) |
| 37:05 | Feature subset assessment |
| 38:31 | Three “Ingredients” |
| 40:08 | Forward Selection (wrapper) |
| 41:31 | Forward Selection (embedded) |
| 42:48 | Forward Selection with GS |
| 43:49 | Forward Selection w. Trees |
| 44:53 | Backward Elimination (wrapper) |
| 45:20 | Backward Elimination (embedded) |
| 45:31 | Backward Elimination: RFE |
| 46:37 | Scaling Factors |
| 59:16 | Learning with scaling factors |
| 60:56 | Formalism (chap. 5) |
| 62:32 | Add/Remove features |
| 63:37 | Recursive Feature Elimination |
| 65:01 | Gradient descent |
| 66:00 | Minimization of a sparsity function |
| 67:27 | The l1 SVM |
| 68:26 | Mechanical interpretation |
| 73:31 | The l0 SVM |
| 76:27 | Embedded method - summary |
| 77:22 | Causality |
| 77:33 | Variable/feature selection |
| 77:48 | What can go wrong? (1) |
| 80:51 | What can go wrong? (2) |
| 81:18 | What can go wrong? (3) |
| 85:09 | Causal feature selection |
| 85:29 | Causal feature relevance |
| 89:42 | Formalism: Causal Bayesian networks |
| 90:42 | Example of Causal Discovery Algorithm |
| 92:35 | Computational and statistical complexity |
| 93:15 | A prototypical MB algo: HITON |
| 93:34 | 1 – Identify variables with direct edges to the target (parent/children) (1) |
| 93:40 | 1 – Identify variables with direct edges to the target (parent/children) (2) |
| 94:07 | 2 – Repeat algorithm for parents and children of Y (get depth two relatives) |
| 94:51 | Causal feature relevance |
| 96:31 | 2 – Repeat algorithm for parents and children of Y (get depth two relatives) |
| 96:36 | 3 – Remove non-members of the MB |
| 96:49 | Wrapping up |
| 97:16 | Complexity of Feature Selection |
| 102:47 | Examples of FS algorithms |
| 104:33 | The CLOP Package |
| 105:05 | NIPS 2003 FS challenge |
| 106:33 | Conclusion |
| 107:39 | Acknowledgements and references |