Topic Models
author: David Blei,
Computer Science Department, Princeton University
published: Nov. 2, 2009, recorded: September 2009, views: 17036
Download slides:
mlss09uk_blei_tm.pdf (8.6 MB)
Reviews and comments:
Extreme clarity in explaining the complex LDA concepts. What started as mythical was clarified by the genius David Blei, an astounding teacher and researcher. Simply superb!
Your concept is completely wrong. Topics are distributed differently, not according to a Dirichlet prior. The creator of a document chooses only 1 or 2 topics, and rarely 3, when writing it, while the number of topics in the collection could be a hundred. I can write a document about the environment and how the U.S. Congress treats it, so it will be a document on 2 topics. It may cover more, but with lower probability, and it can't be 20 or 30.
The right model is 2 urns with colored balls. The first urn has one ball of each color. The user draws 1 or 2 balls from it. Then the user turns to the other urn, which has many balls of each color. The user draws the chosen number of balls but ignores any whose color was not selected from the first urn. Example: the first urn has 1 red, 1 green, 1 blue, 1 yellow. I randomly choose red and green. The second urn has 1000 red, 1000 green, 1000 blue, 1000 yellow. I draw 10 balls but keep only red and green. Let's say the outcome is 7 red and 3 green; this is the topic distribution. It is not a Dirichlet process. This is how it works in real life.
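The two-urn procedure described in this comment can be simulated in a few lines. This is only a minimal sketch under stated assumptions, not anything from the lecture: the function name `urn_topic_counts` is hypothetical, 1 or 2 colors are chosen with equal probability, the large second urn is approximated by drawing with replacement, and drawing continues until 10 balls of the selected colors have been kept (so that the counts add up to 10, as in the 7 red and 3 green example).

```python
import random
from collections import Counter


def urn_topic_counts(colors, n_keep=10, max_topics=2, rng=None):
    """Simulate the commenter's two-urn model (hypothetical helper).

    Assumptions: the number of selected colors is uniform on 1..max_topics,
    and the second urn is large enough that draws are effectively made
    with replacement.
    """
    rng = rng or random.Random()

    # First urn: one ball per color; draw 1..max_topics distinct colors.
    n_topics = rng.randint(1, max_topics)
    chosen = rng.sample(colors, n_topics)

    # Second urn: equal numbers of every color. Balls whose color was not
    # selected from the first urn are ignored (rejected).
    counts = Counter({color: 0 for color in chosen})
    kept = 0
    while kept < n_keep:
        ball = rng.choice(colors)
        if ball in counts:
            counts[ball] += 1
            kept += 1
    return dict(counts)


colors = ["red", "green", "blue", "yellow"]
counts = urn_topic_counts(colors, rng=random.Random(0))
print(counts)
```

Dividing each count by `n_keep` gives the per-document topic proportions that this model would produce.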
I can also add that this model does not consider that some topics have no chance of appearing in the same document. Show me a document that speaks about marine mammals, linear algebra, the crime rate in New York, and the tea ceremony in Japan at the same time. The two parameters introduced in the concept (alpha and beta) do not control topics to the required degree. The concept works and returns reasonable results because it is close to PLSA, and PLSA is close to non-negative matrix factorization.
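Readers who want to see numerically how much the alpha parameter affects the number of topics per document can try a small experiment. The sketch below is an illustration only, assuming alpha denotes a symmetric Dirichlet parameter on per-document topic proportions (the usual LDA notation) and a collection of K = 100 topics; the helper names `sample_symmetric_dirichlet` and `topics_to_cover` are hypothetical. It draws one set of topic proportions for a small and a large alpha and counts how many of the heaviest topics are needed to cover 90% of the probability mass.

```python
import random


def sample_symmetric_dirichlet(k, alpha, rng):
    """Draw topic proportions from a symmetric Dirichlet(alpha) over k topics.

    Uses the standard construction: normalize k independent Gamma(alpha, 1)
    draws. The loop guards against the unlikely case that every draw
    underflows to zero for very small alpha.
    """
    while True:
        draws = [rng.gammavariate(alpha, 1.0) for _ in range(k)]
        total = sum(draws)
        if total > 0.0:
            return [d / total for d in draws]


def topics_to_cover(weights, mass=0.9):
    """Number of largest-weight topics needed to reach the given mass."""
    covered = 0.0
    for n, w in enumerate(sorted(weights, reverse=True), start=1):
        covered += w
        if covered >= mass:
            return n
    return len(weights)


rng = random.Random(0)
K = 100  # assumed number of topics in the collection

sparse = sample_symmetric_dirichlet(K, 0.01, rng)  # small alpha
dense = sample_symmetric_dirichlet(K, 10.0, rng)   # large alpha

print("alpha=0.01:", topics_to_cover(sparse), "topics cover 90% of the mass")
print("alpha=10:  ", topics_to_cover(dense), "topics cover 90% of the mass")
```

The exact counts depend on the random seed, so running the experiment over many draws gives a fairer picture than a single sample.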
How do I download these videos? These don't even buffer...
This website has awesome lectures, but I'm never able to watch them... They don't buffer for me either...
Is there any chance that the complete slides from this presentation are available somewhere? Especially toward the end of the second video, some really important parts are left out due to time constraints.
Please ignore the comment above... next time I will check before I post ^^