## Dirichlet Processes: Tutorial and Practical Course

published: Aug. 27, 2007, recorded: August 2007, views: 140194

# Description

**The Bayesian approach** provides a coherent framework for dealing with uncertainty in machine learning. By integrating out parameters, Bayesian models do not suffer from overfitting, so it is conceivable to consider models with infinitely many parameters, also known as Bayesian nonparametric models. One example of such a model is the Gaussian process, a distribution over functions used in regression and classification problems. Another example is the Dirichlet process, a distribution over distributions. Dirichlet processes are used in density estimation, clustering, and nonparametric relaxations of parametric models. They have been gaining popularity in both the statistics and machine learning communities, due to their computational tractability and modelling flexibility.

In the tutorial I shall introduce Dirichlet processes, and describe different representations of Dirichlet processes, including the Blackwell-MacQueen urn scheme, Chinese restaurant processes, and the stick-breaking construction. I shall also go through various extensions of Dirichlet processes, and applications in machine learning, natural language processing, machine vision, computational biology and beyond.
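
As a rough illustration of two of the representations named above, the following minimal sketch draws table assignments from a Chinese restaurant process and mixture weights from a truncated stick-breaking construction. It is not taken from the lecture materials: the function names `crp_assignments` and `stick_breaking_weights`, the concentration parameter name `alpha`, and the truncation level are all assumptions made for this example.

```python
import random


def crp_assignments(n, alpha, rng):
    """Seat n customers by a Chinese restaurant process (alpha > 0)."""
    counts = []  # counts[k] = number of customers at table k
    seats = []   # seats[i] = table chosen by customer i
    for i in range(n):
        # With i customers seated, join table k with probability
        # counts[k] / (i + alpha), or open a new table with
        # probability alpha / (i + alpha).
        r = rng.random() * (i + alpha)
        acc = 0.0
        table = len(counts)  # default: open a new table
        for k, c in enumerate(counts):
            acc += c
            if r < acc:
                table = k
                break
        if table == len(counts):
            counts.append(0)
        counts[table] += 1
        seats.append(table)
    return seats, counts


def stick_breaking_weights(alpha, truncation, rng):
    """Break a unit stick `truncation` times with Beta(1, alpha) fractions."""
    weights = []
    remaining = 1.0  # length of stick not yet assigned
    for _ in range(truncation):
        beta = rng.betavariate(1.0, alpha)
        weights.append(remaining * beta)
        remaining *= 1.0 - beta
    return weights, remaining


if __name__ == "__main__":
    rng = random.Random(0)
    seats, counts = crp_assignments(20, alpha=1.0, rng=rng)
    print("table sizes:", counts)
    weights, leftover = stick_breaking_weights(alpha=1.0, truncation=10, rng=rng)
    print("first weights:", [round(w, 3) for w in weights[:5]])
```

Larger values of `alpha` tend to produce more tables in the first function and more evenly spread weights in the second. The truncation is only a convenience for the sketch; the construction itself continues indefinitely.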

In the practical course I shall describe inference algorithms for Dirichlet processes based on Markov chain Monte Carlo sampling, and we shall implement a Dirichlet process mixture model, hopefully applying it to discovering clusters of NIPS papers and authors.
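
The page does not say which sampler the practical uses, so the following is only a sketch of one common choice: a collapsed Gibbs sampler for a Dirichlet process mixture of one-dimensional Gaussians. It assumes a known observation variance `sigma2` and a conjugate normal prior with mean `mu0` and variance `tau2` on each cluster mean. These parameter names, the function names, and the toy data are hypothetical.

```python
import math
import random


def predictive_density(x, n, s, sigma2, mu0, tau2):
    """Density of x under a cluster holding n points with sum s,
    with the cluster mean integrated out."""
    post_var = 1.0 / (1.0 / tau2 + n / sigma2)
    post_mean = post_var * (mu0 / tau2 + s / sigma2)
    var = post_var + sigma2
    return math.exp(-0.5 * (x - post_mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def gibbs_dp_mixture(data, alpha=1.0, sigma2=1.0, mu0=0.0, tau2=100.0,
                     sweeps=50, seed=0):
    """Return cluster labels after `sweeps` passes of collapsed Gibbs sampling."""
    rng = random.Random(seed)
    z = [0] * len(data)        # start with every point in one cluster
    counts = {0: len(data)}    # cluster label -> number of points
    sums = {0: sum(data)}      # cluster label -> sum of its points
    next_label = 1
    for _ in range(sweeps):
        for i, x in enumerate(data):
            # Remove point i from its current cluster.
            old = z[i]
            counts[old] -= 1
            sums[old] -= x
            if counts[old] == 0:
                del counts[old], sums[old]
            # Existing clusters are weighted by size times predictive density;
            # a new cluster is weighted by alpha times the prior predictive.
            labels = list(counts)
            weights = [counts[c] * predictive_density(x, counts[c], sums[c],
                                                      sigma2, mu0, tau2)
                       for c in labels]
            labels.append(next_label)
            weights.append(alpha * predictive_density(x, 0, 0.0, sigma2, mu0, tau2))
            new = rng.choices(labels, weights=weights)[0]
            if new == next_label:
                counts[new] = 0
                sums[new] = 0.0
                next_label += 1
            counts[new] += 1
            sums[new] += x
            z[i] = new
    return z


if __name__ == "__main__":
    toy = [-5.2, -5.0, -4.8, -5.1, -4.9, 5.2, 5.0, 4.8, 5.1, 4.9]
    print(gibbs_dp_mixture(toy))
```

The number of clusters is not fixed in advance: it grows or shrinks as points open new clusters or empty old ones. Clustering documents or authors, as the description suggests, would need a different likelihood than the Gaussian used here.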

# See Also:

Download slides: teh_yee_whye_dp_talk.pdf (1.8 MB)

Download article: teh_yee_whye_dp_article.pdf (142.3 KB)

## Reviews and comments:

Afzal Bhatti, September 18, 2007 at 12:40 p.m.:nice

Prasenjit Mukherjee, September 11, 2008 at 1:26 p.m.: One of the best tutorials on understanding Gaussian/Dirichlet Distribution/Process.

-Prasen

xiaopingzhang, December 26, 2008 at 10:51 a.m.: Thank you! The tutorial is a great help to me because I am studying the LDA model.

Aditi Gupta, January 17, 2009 at 9:56 p.m.:Very nice lecture. I really liked how the concepts were introduced and linked together. Very well explained. Thank You!!

teddy, July 23, 2009 at 2:38 p.m.:Shouldn't the formula of posterior over parameters be;

p(w|x,y) = p(w|x)p(y|x,w) / p(y|x)

instead of

p(w|x,y) = p(w)p(y|x,w) / p(y|x)

on slide 5 (time 4:37)?

If not, could anyone kindly tell me why it is ok to take away the conditional of x from the prior?

thanks

Cauchy, July 26, 2009 at 1:25 p.m.:Couldn't it be that w is independent with x?

Rajib Acharya, March 2, 2010 at 10:23 p.m.:Very nice. Is the practical session recorded?

Brian, April 26, 2010 at 5:15 a.m.:p(w|x,y)p(x,y) = p(x,y,w) = p(y|x,w)p(x|w)p(w)

using Bayes rule and as previous poster mentioned x indep. of w

=>

p(w|x,y) = p(y|x,w)p(x)p(w) / (p(y|x)p(x))

= p(y|x,w)p(w) / p(y|x)

As in the slides

fjanoos, May 1, 2010 at 10:09 p.m.:In the slide on de Finetti's theorem, he says "if there exists a sequence of thetas that are exchangeable then there exists a *random* probability measure - a random distribution - which makes the theta's iid"

My question is what is a *random* probability measure ? I.e. does the measure itself depend / vary on the underlying sample space X - and if so, how ?

The wikipedia definition of this theorem does not seem to imply any dependence on X. Any clarifications would be appreciated !

QiangYou, November 9, 2010 at 3:43 a.m.:nice talk! a big help for me. ^_^
