The Neural Autoregressive Distribution Estimator, incl. discussion by Yoshua Bengio

author: Yoshua Bengio, Department of Computer Science and Operations Research, University of Montreal
author: Hugo Larochelle, Google, Inc.
published: May 6, 2011, recorded: April 2011, views: 7336




We describe a new approach for modeling the distribution of high-dimensional vectors of discrete variables. This model is inspired by the restricted Boltzmann machine (RBM), which has been shown to be a powerful model of such distributions. However, an RBM typically does not provide a tractable distribution estimator, since evaluating the probability it assigns to some given observation requires the computation of the so-called partition function, which itself is intractable for RBMs of even moderate size. Our model circumvents this difficulty by decomposing the joint distribution of observations into tractable conditional distributions and modeling each conditional using a non-linear function similar to a conditional of an RBM. Our model can also be interpreted as an autoencoder wired such that its output can be used to assign valid probabilities to observations. We show that this new model outperforms other multivariate binary distribution estimators on several datasets and performs similarly to a large (but intractable) RBM.
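The decomposition described in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: it assumes binary inputs, a fixed left-to-right ordering of the variables, sigmoid hidden and output units, and input-to-hidden weights shared across all conditionals. The names `W`, `V`, `b`, `c` and the function `nade_log_prob` are chosen here for illustration and do not come from the lecture page.

```python
import itertools
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def nade_log_prob(v, W, V, b, c):
    """Log-probability of a binary vector v under a NADE-style model.

    Assumed (hypothetical) parameter shapes:
      W: (H, D) input-to-hidden weights, shared across all conditionals
      V: (D, H) hidden-to-output weights
      b: (D,)   output biases
      c: (H,)   hidden biases
    """
    a = c.copy()          # hidden pre-activation given no inputs yet
    log_p = 0.0
    for i in range(v.shape[0]):
        h = sigmoid(a)                      # hidden units see only v[:i]
        p_i = sigmoid(b[i] + V[i] @ h)      # p(v_i = 1 | v_<i)
        log_p += v[i] * np.log(p_i) + (1 - v[i]) * np.log(1 - p_i)
        a += W[:, i] * v[i]                 # fold v_i in for the next step
    return log_p


# Small random model, used only to check that the probabilities normalize.
rng = np.random.default_rng(0)
D, H = 4, 3
W = rng.normal(size=(H, D))
V = rng.normal(size=(D, H))
b = rng.normal(size=D)
c = rng.normal(size=H)

# Sum p(v) over all 2^D binary vectors; no partition function is needed.
total = sum(
    np.exp(nade_log_prob(np.array(bits, dtype=float), W, V, b, c))
    for bits in itertools.product([0, 1], repeat=D)
)
```

Because each factor is a properly normalized Bernoulli conditional, `total` comes out to 1 up to floating-point error for any parameter values. That is the sense in which the model assigns valid probabilities to observations without computing a partition function.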

Download slides: aistats2011_larochelle_neural_01.pdf (1.4 MB)


Reviews and comments:

Comment 1: Vik, September 23, 2014 at 10:26 p.m.:

At ~3:50, what does he mean by "It gives you a good model but not a good distribution estimator"? Aren't they the same thing?
