Dimensionality Reduction by Feature Selection in Machine Learning
Description
Dimensionality reduction is a commonly used step in machine learning, especially when dealing with a high-dimensional feature space. The original feature space is mapped onto a new space of reduced dimensionality, and the examples used by machine learning algorithms are represented in that new space. The mapping is usually performed by selecting a subset of the original features, by constructing new features, or both. This presentation deals with the first approach, feature subset selection. We provide a brief overview of the feature subset selection techniques commonly used in machine learning and give a more detailed description of feature subset selection on text data. The performance of some methods used in document categorization is illustrated by an experimental comparison on real-world data collected from the Web.
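As a rough illustration of feature subset selection on text, the sketch below scores each word individually and keeps only the top-scoring ones. It assumes documents are represented as sets of words and uses information gain as the scoring measure; that choice, the toy documents, and the function names (`entropy`, `information_gain`, `select_features`) are assumptions made for this example and are not taken from the lecture.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(docs, labels, word):
    """Reduction in class entropy from knowing whether `word` occurs."""
    with_word = [y for d, y in zip(docs, labels) if word in d]
    without_word = [y for d, y in zip(docs, labels) if word not in d]
    n = len(labels)
    conditional = ((len(with_word) / n) * entropy(with_word)
                   + (len(without_word) / n) * entropy(without_word))
    return entropy(labels) - conditional


def select_features(docs, labels, k):
    """Return the k highest-scoring words (ties broken alphabetically)."""
    vocabulary = set().union(*docs)
    ranked = sorted(vocabulary,
                    key=lambda w: (-information_gain(docs, labels, w), w))
    return ranked[:k]


# Hypothetical toy corpus: each document is a set of words.
docs = [
    {"goal", "match", "team"},
    {"team", "score", "goal"},
    {"election", "vote", "team"},
    {"vote", "party", "election"},
]
labels = ["sport", "sport", "politics", "politics"]

selected = select_features(docs, labels, k=2)
# Represent every document in the reduced feature space.
reduced = [sorted(d & set(selected)) for d in docs]
print(selected)
print(reduced)
```

Each feature is scored independently of the others, which keeps the procedure cheap enough for the very large vocabularies typical of text data, at the cost of ignoring interactions between features.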
| Slides | |
| 0:01 | Dimensionality Reduction by Feature Selection in Machine Learning |
| 0:15 | Reasons for dimensionality reduction |
| 0:59 | Approaches to dimensionality reduction |
| 3:02 | Example for the problem |
| 4:05 | Search for feature subset |
| 4:37 | Feature subset selection |
| 5:18 | Approaches to feature subset selection |
| 6:50 | Filtering |
| 7:41 | Filters: Distribution-based [Koller & Sahami 1996] |
| 8:23 | Filters: Relief [Kira & Rendell 1992] |
| 9:32 | Filters: FOCUS [Almuallim & Dietterich 1991] |
| 11:00 | Illustration of FOCUS |
| 11:37 | Filters: Random [Liu & Setiono 1996] |
| 12:43 | Filters: MDL-based [Pfahringer 1995] |
| 13:30 | Wrapper |
| 14:52 | Wrappers: Instance-based learning |
| 15:42 | Wrappers: Decision tree induction |
| 16:48 | Metric-based model selection |
| 18:56 | Embedded |
| 19:15 | Embedded |
| 20:19 | Embedded: in filters [Cardie 1993] |
| 21:03 | Simple Filtering |
| 21:43 | Feature subset selection on text data – commonly used methods |
| 25:08 | Scoring individual feature |
| 28:46 | Influence of feature selection on the classification performance |
| 29:41 | Illustration of feature selection |
| 30:22 | Illustration on 5 datasets from Yahoo! hierarchy using Naïve Bayes [Mladenic & Grobelnik 2003] |
| 32:25 | CrossEntropy |
| 34:28 | Rank of the correct category in the list of all categories; F2-measure combining precision and recall, with emphasis on recall; Ctgs – number of categories looking promising (testing example needs to be class |
| 39:18 | Illustration on Reuters-2000 Data [Brank et al 2002] |
| 40:03 | Experiments with Naïve Bayes Classifier |
| 41:08 | Average number of nonzero components per vector instead of the overall no. of features |
| 41:11 | Experiments with Perceptron Classifier |
| 41:46 | Experiments with the Linear SVM Classifier |
| 42:30 | Discussion: Using discarded features can help |
| 44:19 | Discussion |




