On the Borders of Statistics and Computer Science
published: Feb. 25, 2007, recorded: May 2005, views: 2753
Machine learning in computer science and prediction and classification in statistics are essentially equivalent fields. I will try to illustrate the relation between theory and practice in this huge area with a few examples and results. In particular, I will try to address an apparent puzzle. Worst-case analyses based on empirical process theory seem to suggest that good prediction (supervised learning) should be very difficult, even for moderate data dimension and reasonable sample sizes. Practice, on the other hand, seems to indicate that we can often do very well even when the number of dimensions is far higher than the number of observations. I will also discuss a new method of dimension estimation and some features of cross-validation.
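The puzzle in the abstract can be illustrated with a small simulation. The sketch below is not taken from the lecture: the synthetic data model (500 observed features driven by 3 hidden factors), the use of ridge regression, the penalty grid, and all function names are assumptions chosen for illustration. It shows one setting in which prediction works with far more dimensions than observations, and uses 5-fold cross-validation to pick the penalty.

```python
import numpy as np

# Assumed synthetic setup (not from the lecture): n = 60 observations,
# p = 500 features, but the features are driven by only d = 3 latent factors.
rng = np.random.default_rng(0)
n, p, d = 60, 500, 3
Z = rng.normal(size=(n, d))                       # hidden low-dimensional factors
A = rng.normal(size=(d, p))                       # loadings onto observed features
X = Z @ A + 0.1 * rng.normal(size=(n, p))         # observed high-dimensional data
y = Z @ np.ones(d) + 0.1 * rng.normal(size=n)     # response depends on factors only


def ridge_predict(X_train, y_train, X_test, lam):
    """Ridge regression in dual form, which only needs an n-by-n solve."""
    x_mean = X_train.mean(axis=0)
    y_mean = y_train.mean()
    Xc = X_train - x_mean
    K = Xc @ Xc.T                                 # Gram matrix of centered data
    alpha = np.linalg.solve(K + lam * np.eye(len(y_train)), y_train - y_mean)
    return (X_test - x_mean) @ Xc.T @ alpha + y_mean


def kfold_mse(X, y, lam, k=5, seed=0):
    """Average held-out squared error over k folds."""
    idx = np.random.default_rng(seed).permutation(len(y))
    errors = []
    for fold in np.array_split(idx, k):
        train = np.setdiff1d(idx, fold)           # all indices not in this fold
        pred = ridge_predict(X[train], y[train], X[fold], lam)
        errors.append(np.mean((pred - y[fold]) ** 2))
    return float(np.mean(errors))


# Hypothetical penalty grid; cross-validation selects among these values.
lams = [0.1, 1.0, 10.0, 100.0, 1000.0]
cv_errors = {lam: kfold_mse(X, y, lam) for lam in lams}
best_lam = min(cv_errors, key=cv_errors.get)
baseline = float(np.var(y))                       # error of always predicting the mean

print("best penalty:", best_lam)
print("cross-validated MSE:", cv_errors[best_lam])
print("baseline MSE (predict the mean):", baseline)
```

In this construction the data lie close to a 3-dimensional subspace, so the effective dimension is much smaller than the nominal 500. That is one possible reason, under these assumptions, why worst-case bounds stated in terms of the nominal dimension can be pessimistic.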