Mining Topics in Documents: Standing on the Shoulders of Big Data

Published on 2014-10-08, 3721 views

Zhiyuan (Brett) Chen

Topic modeling has been widely used to mine topics from documents. However, a key weakness of topic modeling is that it needs a large amount of data (e.g., thousands of documents) to provide reliable

Research Sessions

Presentation

Mining Topics in Documents: Standing on the Shoulders of Big Data 00:00

Topic models require a large amount of docs 00:08

Example Task Application 00:18

Can we improve modeling using Big Data? 00:29

Human Learning 01:00

Motivation 01:07

Proposed Model Flow 01:39

What’s the knowledge representation? 02:13

How does a baby gain knowledge? 02:29

Knowledge Representation 02:44

Knowledge Extraction 02:49

Frequent Itemset Mining (FIM) 03:05

Extracting Cannot-Links 03:23

Related Work about Cannot-Links 03:55

However, both of them assume... 04:12

Knowledge Verification 04:49

Must-Link Graph 05:08

Pointwise Mutual Information 05:21

Cannot-Links Verification 05:46

Proposed Gibbs Sampler 06:25

Example 06:40

M-GPU - 1 07:04

M-GPU - 2 07:21

M-GPU - 3 07:42

M-GPU - 4 07:59

M-GPU - 5 08:06

M-GPU - 6 08:11

M-GPU - 7 08:20

M-GPU - 8 08:31

Evaluation 08:52

Model Comparison 09:32

Topic Coherence 09:53

Topic Coherence Results 10:14

Human Evaluation Results 10:53

Electronics vs. Non-Electronics 11:50

Conclusions 12:50

Future Work 13:20

Q&A 14:02