Experiments with Non-parametric Topic Models

Published on 2015-02-20 (3559 views)

Wray Buntine

This talk will cover some of our recent work in extended topic models to serve as tools in text mining and NLP (and hopefully, later, in IR) when some semantic analysis is required. In some sense our

Solomon seminar

Presentation

00:00 Non-parametric Methods for Unsupervised Semantic Modelling

01:07 Outline - 1

02:00 Information Overload

02:06 Information Warfare

02:20 Outline - 2

02:38 Probability Vectors

03:18 Sharing/Inheritance with a Probability Hierarchy

05:09 Overview: Latent Semantic Modelling

07:23 Outline - 3

07:26 Dirichlet distributions

07:33 4-D Dirichlet samples

09:00 Forms for 3-D Dirichlet

09:01 The Dirichlet Distribution

09:01 Dirichlet Details

09:02 Outline - 4

09:21 Latent Dirichlet Allocation

11:06 Learning Algorithms with Dirichlets

12:23 Context Free Grammar

12:34 Probabilistic Context Free Grammar, cont.

12:44 Bayesian Networks

12:48 Bayesian Model Averaging (BMA) for Bayesian Networks

12:48 Bayesian Model Averaging (BMA) for Bayesian Networks, cont.

12:50 N-grams

13:36 Decision Trees

13:37 Breiman’s Random Forests or Bagging of Decision Trees

13:38 Bayesian Model Averaging and Non-parametrics Storyline

14:03 Bayesian Model Averaging and Non-parametrics Storyline, cont.

15:14 Motivation

15:16 Outline - 5

15:19 Dirichlet Process

15:23 Bayesian Idea: Similar Context Means Similar Word

17:05 Bayesian N-grams

17:10 Bayesian N-grams, cont.

17:28 Historical Context - 1

18:53 Outline - 6

18:56 Historical Context - 2

19:48 The Ideal Hierarchical Component?

20:38 Why We Prefer DPs and PYPs over Dirichlets!

21:34 Outline - 7

21:37 Component Models, Generally

22:16 Matrix Approximation View

23:07 Why Topic Models?

24:21 Topic Models: Just an Intermediate Goal

25:14 ASIDE: Aspects, Ratings and Sentiments

25:43 Evaluation - 1

27:30 Topic Models: Potential for Semantics

27:56 Evaluation - 2

28:17 Outline - 8

29:06 Text and Burstiness

30:59 Aside: Burstiness and Information Retrieval

32:06 Outline - 9

32:17 Evolution of Models - 1

33:50 Evolution of Models - 2

34:21 Previous Work

34:46 Evolution of Models - 3

35:00 Evolution of Models - 4

35:24 Evolution of Models - 5

37:27 Outline - 10

37:28 Our Non-parametric Topic Model - 1

37:28 Our Non-parametric Topic Model - 2

37:41 Our Non-parametric Topic Model - 3

37:54 Our Non-parametric Topic Model - 4

38:22 Design Notes - 1

38:23 Design Notes - 2

38:24 Design Notes - 3

38:25 Performance on Reuters-21578 ModLewis Split

39:53 Perplexity performance on MLT Data for different Topics

39:53 Comparison to PCVB0 and Mallet

41:01 Comparison to Bryant+Sudderth (2012) on NIPS data

41:39 Comparison to FTM and LIDA

41:42 Conclusion on Topic Models - 1

41:45 Conclusion on Topic Models - 2

41:46 Conclusion on Topic Models - 3

41:46 Conclusion on Topic Models - 4

43:05 Conclusion on Topic Models - 5

44:19 Outline - 11

44:23 Aspect-based Opinion Aggregation

45:12 Explaining the Model - 2

45:12 Explaining the Model - 3

45:34 Explaining the Model - 1

45:52 Explaining the Model - 4

46:57 Outline - 12

47:15 Task, Roughly - 1

48:20 Segmentation Model – Generative process

48:25 Experiments on two meeting transcripts

49:29 Conclusion of Segmentation with a Structured Topic Model - 1

49:29 Conclusion of Segmentation with a Structured Topic Model - 2

49:30 Fun with Bibliographies (Lim et al., ACML 2014)

50:25 Fun with Aspects and Sentiment (Lim et al., CIKM 2014)

50:28 Conclusion