Workshop on Modelling in Classification and Statistical Learning, Eindhoven 2004
The present workshop addresses the problem of predicting a binary label Y from a given feature X. A classification procedure is to be learned from a training set (X1, Y1), ..., (Xn, Yn). In the statistical literature on classification, the training set is traditionally seen as an i.i.d. sample from the distribution P of (X, Y), with no a priori knowledge of P otherwise assumed. Theoretical results have been derived that hold no matter what P is, which typically means that such results concentrate on worst cases. There are various reasons to step away from this so-called black-box approach. For example, the by now generally accepted rule "regression is harder than classification" has given certain "plug-in" methods a bad name, although under distributional assumptions the latter are at least competitive with "direct" methods. Moreover, theoretical results for the case where P is assumed to lie within a small class can provide benchmarks for what one may hope for. Also, procedures that adapt to properties of P need further exploration. These procedures are designed to work well when one is "lucky", and as such are also inspired by having certain distributional assumptions in the back of one's mind. Moreover, it is often quite reasonable to assume some knowledge of the marginal distribution of X.