NIPS Workshop on Representations and Inference on Probability Distributions, Whistler 2007
Explicitly estimating distributions is in general infeasible in high-dimensional settings, since the associated learning rates can be arbitrarily slow. At the same time, a great variety of applications in machine learning and computer science require distribution estimation and/or comparison. Examples include testing for homogeneity (the "two-sample problem"), independence, and conditional independence, where the last two can be used to infer causality; data set squashing / data sketching / data anonymisation; domain adaptation (the transfer of knowledge learned on one domain to solving problems on another, related domain) and the related problem of covariate shift; message passing in graphical models (EP and related algorithms); compressed sensing; and links between divergence measures and loss functions.
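As a concrete illustration of the two-sample problem mentioned above, the following sketch compares two samples without estimating their densities, using a kernel-based discrepancy (a biased maximum mean discrepancy estimate with a Gaussian kernel) and a permutation test. The kernel bandwidth, permutation count, and function names here are illustrative choices, not prescribed by any particular method discussed at the workshop.

```python
import numpy as np

def gaussian_kernel(X, Y, sigma=1.0):
    # Pairwise Gaussian (RBF) kernel matrix between rows of X and rows of Y.
    d2 = np.sum(X**2, 1)[:, None] + np.sum(Y**2, 1)[None, :] - 2.0 * X @ Y.T
    return np.exp(-d2 / (2.0 * sigma**2))

def mmd2_biased(X, Y, sigma=1.0):
    # Biased (V-statistic) estimate of the squared maximum mean discrepancy:
    # E[k(x,x')] + E[k(y,y')] - 2 E[k(x,y)], with expectations replaced by means.
    return (gaussian_kernel(X, X, sigma).mean()
            + gaussian_kernel(Y, Y, sigma).mean()
            - 2.0 * gaussian_kernel(X, Y, sigma).mean())

def mmd_permutation_test(X, Y, sigma=1.0, n_perm=200, seed=0):
    # Approximate the null distribution of the statistic by repeatedly
    # reassigning the pooled samples at random to two groups.
    rng = np.random.default_rng(seed)
    Z = np.vstack([X, Y])
    n = len(X)
    observed = mmd2_biased(X, Y, sigma)
    exceed = 0
    for _ in range(n_perm):
        perm = rng.permutation(len(Z))
        if mmd2_biased(Z[perm[:n]], Z[perm[n:]], sigma) >= observed:
            exceed += 1
    # Add-one correction keeps the p-value strictly positive.
    return observed, (exceed + 1) / (n_perm + 1)
```

For samples drawn from the same distribution the p-value is roughly uniform, while a modest mean shift between the samples drives it toward the minimum attainable value of 1/(n_perm + 1); this finite-sample trade-off between test power and computational cost is exactly the kind of compromise the workshop aims to examine.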
The purpose of this workshop is to bring together statisticians, machine learning researchers, and computer scientists working on representations of distributions for various inference and testing problems, to discuss the compromises necessary to obtain useful results from finite data. In particular, what are the capabilities and weaknesses of different distribution estimates and comparison strategies, and what negative results apply?