Experiments with Non-parametric Topic Models
published: Feb. 20, 2015, recorded: January 2015, views: 3523
This talk will cover some of our recent work on extended topic models, which serve as tools in text mining and NLP (and hopefully, later, in IR) when some semantic analysis is required. In some sense our goals are akin to those of Latent Semantic Analysis. Our basic theoretical and algorithmic tool is non-parametric Bayesian methods for reasoning on hierarchies of probability vectors.
The concepts will be introduced, but not the statistical detail. Then I'll present parts of our KDD 2014 paper ("Experiments with Non-parametric Topic Models") and some extended work, such as "Bibliographic Analysis with the Citation Network Topic Model" (ACML 2014) and "Topic Segmentation with a Structured Topic Model" (NAACL 2013). Various evaluations and comparisons will be made. The fully non-parametric topic model with burstiness is currently the best-performing published model by a number of measures, and it is only a small factor slower (and a small factor larger in memory) than standard LDA implementations.