Sparsity analysis of term weighting schemes and application to text classification

Published on 2007-02-25, 3456 views

Janez Brank

We revisit the common practice of feature selection for dimensionality and noise reduction. This typically involves scoring and ranking features based on some weighting scheme and selecting top ranked

Subspace, Latent Structure and Feature Selection Techniques 2005 - Bohinj

Presentation

Sparsity Analysis of Term Weighting Schemes and Application to Text Classification 00:01

Introduction 00:20

Feature Weighting Schemes 02:05

Characterization of Feature Rankings in terms of Sparsity 08:12

Sparsity Curves 11:02

Sparsity as the independent variable 16:09

Performance as a function of the number of features (Naïve Bayes, 16 categories of RCV2) 18:23

Performance as a function of sparsity 20:46

Sparsity as a cutoff criterion 21:56

Results 24:06

Conclusions 25:56

Future work 26:32