Universal Coding/Prediction and Statistical (In)consistency of Bayesian inference
published: Feb. 25, 2007, recorded: October 2005, views: 4602
Part of this talk is based on results of A. Barron (1986) and recent joint work with J. Langford (2004). We introduce the information-theoretic concepts of universal coding and prediction. Under weak conditions on the prior, Bayesian sequential prediction is universal: a code based on the Bayesian predictive distribution allows one to substantially compress the data. We give a simple proof that universality implies consistency of the Bayesian posterior. It follows that Bayesian inconsistency in nonparametric settings (à la Diaconis & Freedman) can only occur when the prior does not allow for data compression. This gives a frequentist rationale for Rissanen's Minimum Description Length Principle. We also show that under misspecification, the Bayesian predictions can substantially outperform the predictions of the best distribution in the model. Ironically, this implies that the Bayesian posterior can become *inconsistent*: in some sense, good predictive performance implies inconsistency!
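To make the link between Bayesian sequential prediction and coding concrete, here is a minimal sketch in a toy setting that is not taken from the talk: a finite grid of Bernoulli models with a uniform prior (both are assumptions chosen for illustration, as are all function and variable names). The accumulated log loss of the Bayesian predictive distribution is the length, in bits, of a code for the data. In this finite setting that length exceeds the code length of the best single model in the grid by at most log2(K) bits, where K is the number of models, which is the simplest instance of universality.

```python
import math
import random


def bernoulli_code_length(bits, theta):
    """Code length in bits of the sequence under a fixed Bernoulli(theta) model."""
    return -sum(math.log2(theta if b else 1.0 - theta) for b in bits)


def bayes_code_length(bits, thetas, prior):
    """Accumulated log loss (in bits) of Bayesian sequential prediction.

    At each step the next symbol is predicted with the posterior-weighted
    mixture of the models, and the posterior is then updated by Bayes' rule.
    """
    posterior = list(prior)
    total = 0.0
    for b in bits:
        # Likelihood that each model assigns to the observed symbol.
        likes = [t if b else 1.0 - t for t in thetas]
        # Bayesian predictive probability of the observed symbol.
        pred = sum(w * l for w, l in zip(posterior, likes))
        total += -math.log2(pred)
        # Posterior update.
        posterior = [w * l / pred for w, l in zip(posterior, likes)]
    return total


# Hypothetical toy setup: 9 candidate models, uniform prior, data from theta = 0.3.
random.seed(0)
thetas = [i / 10 for i in range(1, 10)]
prior = [1.0 / len(thetas)] * len(thetas)
bits = [random.random() < 0.3 for _ in range(500)]

bayes_bits = bayes_code_length(bits, thetas, prior)
best_bits = min(bernoulli_code_length(bits, t) for t in thetas)
regret = bayes_bits - best_bits      # extra bits paid for not knowing theta
bound = math.log2(len(thetas))       # -log2 of the prior mass on any one model

print(f"Bayes code length: {bayes_bits:.2f} bits")
print(f"Best single model: {best_bits:.2f} bits")
print(f"Regret: {regret:.2f} bits (bound: {bound:.2f})")
```

The regret stays bounded by a constant while both code lengths grow linearly with the sample size, so the per-symbol overhead of the Bayesian code vanishes. The nonparametric and misspecified cases discussed in the talk are more delicate and are not captured by this finite sketch.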