# SEARCH RESULTS:

### Search: MLSS - Matches: 336

## Event sections:

Evening Talks | |||

Students Session | |||

## Events:

Cognitive Science and Machine Learning Summer School (MLSS), Sardinia 2010Cognitive science aims to reverse engineer human intelligence; machine learning provides one of our most powerful sources of insight into how machine intelligence is possible. Cognitive science therefore raises challenges for, and draws inspiration from, machine learning; and insights about ... | |||

Machine Learning Summer School (MLSS), Berder Island 2004The fourth Machine Learning Summer School was held in Berder Island, France between the 12th and the 25th of September, 2004. More than 100 students and researchers from 20 countries interested in Machine Learning attended. This year's summer school presented ... | |||

Machine Learning Summer School (MLSS), Bordeaux 2011The school provides tutorials and practical sessions on basic and advanced topics of machine learning by leading researchers in the field. The summer school is intended for students, young researchers and industry practitioners with an interest in machine learning and ... | |||

Machine Learning Summer School (MLSS), Cambridge 2009The 13th Machine Learning Summer School was held in Cambridge, UK. This year's edition was organized by the University of Cambridge, Microsoft Research and PASCAL. The school offered an overview of basic and advanced topics in machine learning through theoretical ... | |||

Machine Learning Summer School (MLSS), Canberra 2002Machine Learning is a foundational discipline of the Information Sciences. It combines deep theory from areas as diverse as Statistics, Mathematics, Engineering, and Information Technology with many practical and relevant real life applications. The aim of the summer school is ... | |||

Machine Learning Summer School (MLSS), Canberra 2005Machine Learning is a foundational discipline of the Information Sciences. It combines deep theory from areas as diverse as Statistics, Mathematics, Engineering, and Information Technology with many practical and relevant real life applications. The aim of the summer school is ... | |||

Machine Learning Summer School (MLSS), Canberra 2006This school is suitable for all levels, both for people without previous knowledge in Machine Learning, and those wishing to broaden their expertise in this area. It will allow the participants to get in touch with international experts in this ... | |||

Machine Learning Summer School (MLSS), Canberra 2010Machine Learning is a foundational discipline of the Information Sciences. It combines deep theory from areas as diverse as Statistics, Mathematics, Engineering, and Information Technology with many practical and relevant real life applications. The aim of the summer school is ... | |||

Machine Learning Summer School (MLSS), Chicago 2005Machine learning is a field focused on making machines learn to make predictions from examples. It combines elements of mathematics, computer science, and statistics with applications in biology, physics, engineering and any other area where automated prediction is necessary. This ... | |||

Machine Learning Summer School (MLSS), Chicago 2009The theme of this year's summer school is Theory and Practice of Computational Learning, to be held in conjunction with a research workshop on the same topic, during the period June 1-11, 2009 at International House, University of Chicago. The ... | |||

Machine Learning Summer School (MLSS), Kioloa 2008This school is suitable for all levels, both for people without previous knowledge in Machine Learning, and those wishing to broaden their expertise in this area. It will allow the participants to get in touch with international experts in this ... | |||

Machine Learning Summer School (MLSS), La Palma 2012The school addresses the following topics: Learning Theory, Kernel Methods, Bayesian Machine learning, Monte Carlo Methods, Bayesian Nonparametrics, Optimization, Graphical Models, Information theory and Dimensionality Reduction. Detailed information can be found here. | |||

Machine Learning Summer School (MLSS), Taipei 2006The two-week summer school program consists of about 40+ hours of lectures from 7/24 to 8/4. The late afternoon sessions will provide a chance for the participants to discuss the latest research problems with the invited speakers. Prior to the ... | |||

Machine Learning Summer School (MLSS), Tübingen 2003Machine Learning is a foundational discipline of the Information Sciences. It combines deep theory from areas as diverse as Statistics, Mathematics, Engineering, and Information Technology with many practical and relevant real life applications. The aim of the summer school is ... | |||

Machine Learning Summer School (MLSS), Tübingen 2007Machine Learning is a foundational discipline of the Information Sciences. It combines theory from areas as diverse as Statistics, Mathematics, Engineering, and Information Technology with many practical and relevant real life applications. The aim of the summer school is to ... | |||

## Meta project / project group:

Conferences26th AAAI Conference on Artificial Intelligence, Toronto 2012 I. konferenca slovenskih mladih raziskovalcev, podiplomskih in dodiplomskih študentov iz sveta in Slovenije »Kako od bega možganov do možganske obogatitve«, Ljubljana 2012 META-FORUM 2012 - A Strategy for Multilingual Europe, Brussels Alan ... | |||

## Debates:

John Langford, David McAllester, Yasemin Altun, Yann LeCun, Zoubin Ghahramani, Sanjoy Dasgupta, Partha Niyogi:
Lunch debate 23.5.2005 | |||

John Langford, Robert Schapire, Yann LeCun, Mikhail Belkin, Yoram Singer:
Lunch debate 24.5.2005 | |||

John Langford, Yasemin Altun, Yann LeCun, Yoram Singer, Robert Schapire, David McAllester:
Lunch debate 25.5.2005 | |||

John Langford, Rich Caruana, Mikhail Belkin, Adam Kalai:
Lunch debate 27.5.2005 | |||

## Introductions:

Douglas Aberdeen:
Introduction | |||

Marcus Hutter:
Introduction to the MLSS 2008 | |||

John Langford:
Welcome to Chicago, and a (brief!) introduction to machine learning | |||

## Invited talks:

Emmanuel Dupoux:
Early language bootstrappingHuman infants learn spontaneously and effortlessly the language(s) spoken in their environments, despite the extraordinary complexity of the task. In the past 30 years, tremendous progress has been made regarding the empirical investigation of the linguistic achievements of infants during ... | |||

Grace Wahba:
Examining the Relative Influence of Familial, Genetic, and Environmental Covariate Information in Flexible Risk ModelsWe present a novel method for examining the relative influence of familial, genetic and environmental covariate information in flexible nonparametric risk models. Our goal is investigating the relative importance of these three sources of information as they are associated with ... | |||

Stuart Geman:
Generative Models for Image AnalysisA probabilistic grammar for the grouping and labeling of parts and objects, when taken together with pose and part-dependent appearance models, constitutes a generative scene model and a Bayesian framework for image analysis. To the extent that the generative model ... | |||

Peter J. Bickel:
Inference for NetworksA great deal of attention has recently been paid to determining sub-communities on the basis of relations, corresponding to edges, between individuals, corresponding to vertices out of an unlabelled graph (Newman, SIAM Review 2003; Airoldi et al JMLR 2008; Leskovec ... | |||

Tomaso A. Poggio:
Learning in Hierarchical Architectures: from Neuroscience to Derived KernelsUnderstanding the processing of information in our cortex is a significant part of understanding how the brain works, arguably one of the greatest problems in science today. In particular, our visual abilities are computationally amazing: computer science is still far ... | |||

Emmanuel Candes:
Low-rank modelingInspired by the success of compressive sensing, the last three years have seen an explosion of research in the theory of low-rank modeling. By now, we have results stating that it is possible to recover certain low-rank matrices from a ... | |||

Emmanuel Candes:
Matrix Completion via Convex Optimization: Theory and AlgorithmsThis talk considers a problem of considerable practical interest: the recovery of a data matrix from a sampling of its entries. In partially filled out surveys, for instance, we would like to infer the many missing entries. In the area ... | |||

Stéphane Mallat:
Sparse Representations from Inverse Problems to Pattern RecognitionSparse representations are at the core of many low-level signal processing procedures and are used by most pattern recognition algorithms to reduce the dimension of the search space. Structuring sparse representations for pattern recognition applications requires taking into account invariants ... | |||

Gary L Miller:
Spectral Graph Theory, Linear Solvers and Applications We discuss the development of combinatorial methods for solving symmetric diagonally dominant linear systems. Over the last fifteen years the computer science community has made substantial progress in fast solvers for SDD systems. For general SDD systems the upper bound ... | |||

Herbert Edelsbrunner:
The Stability of the Contour of an Orientable 2-ManifoldThink of the view of the boundary of a solid shape as a projection of a 2-manifold to R^2. Its apparent contour is the projection of the critical points. Generalizing the projection to smooth mappings of a 2-manifold to R^2, ... | |||

## Interviews:

Bernhard Schölkopf:
Interview about past, present, future of MLSSIn this interview the Videolectures.Net team spoke to Bernhard Schölkopf at the MLSS 2007 in Tuebingen. We were interested how he sees the social part of the school, if he still attends school with the same enthusiasm, if the talks ... | |||

John Langford:
Short interviews MLSS05 Chicago by John Langford | |||

## Lectures:

Yoonkyung Lee:
A Bahadur Type Representation of the Linear Support Vector Machine and its Relative EfficiencyThe support vector machine has been used successfully in a variety of applications. Also on the theoretical front, its statistical properties including Bayes risk consistency have been examined rather extensively. Taking another look at the method, we investigate the asymptotic ... | |||

Manfred K. Warmuth:
A Bayesian Probability Calculus for Density MatricesOne of the main concepts in quantum physics is a density matrix, which is a symmetric positive definite matrix of trace one. Finite probability distributions can be seen as a special case when the density matrix is restricted to be ... | |||

Peter L. Bartlett:
AdaBoost is Universally ConsistentWe consider the risk, or probability of error, of the classifier produced by AdaBoost, and in particular the stopping strategy to be used to ensure universal consistency. (A classification method is universally consistent if the risk of the classifiers it ... | |||

Bernhard Schölkopf:
A discussion about ML | |||

Olivier Bousquet:
Advanced Statistical Learning TheoryThis set of lectures will complement the statistical learning theory course and focus on recent advances in the domain of classification. 1- PAC Bayesian bounds: a simple derivation, comparison with Rademacher averages. 2 - Local Rademacher complexity with classification loss, ... | |||

Peter Culicover:
Adventures with CamilleA computational simulation of a minimalist language learner | |||

John Langford:
Agnostic Active LearningThe great promise of active learning is that via interaction the number of samples required can be reduced to logarithmic in the number required for standard batch supervised learning methods. To achieve this promise, active learning must be able to ... | |||

Wouter Boomsma:
A Graphical Model of the Local Structure of Proteins | |||

Theodore Alexandrov:
A Kernel-based Nonlinear Approach for Time Series Forecast | |||

Markus Hegland:
Algorithms for Association Rules | |||

Steve Smale:
Algorithms for Learning and their EstimatesWe will try to give an elementary account of bounds for "regularized least squares" that reflects our current knowledge. The framework for the discussion is that of Reproducing Kernel Hilbert Spaces, with a regression point of view. As a corollary ... | |||

Ding-Xuan Zhou:
Analysis of Support Vector Machine Classification | |||

Mark Johnson:
An introduction to grammars and parsingFrom MJ: "The following is a fairly advanced summary of the material I'll be covering in my talk. Don't be dismayed if you find it hard to understand now; I hope that after my talk it will be much clearer!" | |||

Elad Yom Tov:
An Introduction to Pattern Classification | |||

Adam Kowalczyk:
Anti-LearningThe Biological domain poses new challenges for statistical learning. In the talk we shall analyze and theoretically explain some counter-intuitive experimental and theoretical findings that systematic reversal of classifier decisions can occur when switching from training to independent test data ... | |||

Arun Sharma:
A Unified Approach to Deduction and Induction | |||

Mike Tipping:
Bayesian Inference: Principles and PracticeThe aim of this course is two-fold: to convey the basic principles of Bayesian machine learning and to describe a practical implementation framework. Firstly, we will give an introduction to Bayesian approaches, focussing on the advantages of probabilistic modelling, the ... | |||

Alexander J. Smola:
Bayesian Kernel Methods | |||

Zoubin Ghahramani:
Bayesian LearningBayes Rule provides a simple and powerful framework for machine learning. This tutorial will be organised as follows: 1. I will give motivation for the Bayesian framework from the point of view of rational coherent inference, and highlight the important ... | |||

Konrad Körding:
Bayesian modeling of action and perception and some other stuff | |||

Zoubin Ghahramani:
Bayesian Modelling | |||

Joshua B. Tenenbaum:
Bayesian models and cognitive development | |||

Pierre Baldi:
Bioinformatics | |||

Adam Kowalczyk:
Bioinformatics Challenge: Learning in Very High Dimensions with Very Few SamplesDedicated machine learning procedures have already become an integral part of modern genomics and proteomics. However, these very high dimensional and low learning sample tasks often stretch these procedures well beyond natural boundaries of their applicability. A few such challenges ... | |||

Ron Meir:
Boosting | |||

Gunnar Rätsch:
Boosting | |||

Klaus-Robert Müller:
Brain Computer InterfacesBrain Computer Interfacing (BCI) aims at making use of brain signals for e.g. the control of objects, spelling, gaming and so on. This tutorial will first provide a brief overview of the current BCI research activities and provide details in ... | |||

Tim van Erven:
Catching Up Faster in Bayesian Model Selection and Model Averaging | |||

Mark Liberman:
Categorical Perception + Linear Learning = Shared CultureIn a group of entities who learn by observing one another's behavior, some simple assumptions about the nature of perception, the nature of individual beliefs, and the nature of learning lead naturally to collective convergence on a random set of ... | |||

Fernando Perez-Cruz:
Channel Coding with LDPC Codes | |||

Matthias Hein:
Cheeger Cuts and p-Spectral ClusteringSpectral clustering has become in recent years one of the most popular clustering algorithms. In this talk I discuss a generalized version of spectral clustering based on the second eigenvector of the graph p-Laplacian, a non-linear generalization of the graph ... | |||

Bernhard Schölkopf:
Closing remarks | |||

Marina Meila:
Clustering - An overviewClustering, or finding groups in data, is as old as machine learning itself, if not older. However, as more people use clustering in a variety of settings, the last few years have brought unprecedented developments in this field. This ... | |||

DeLiang Wang:
Cocktail Party Problem as Binary ClassificationSpeech segregation, or the cocktail party problem, has proven to be extremely challenging. Part of the challenge stems from the lack of a carefully analyzed computational goal. While the separation of every sound source in a mixture is considered the ... | |||

Shimon Ullman:
Computational models of vision | |||

Andrew Blake:
Computer Vision | |||

Gabor Lugosi:
Concentration inequalities in machine learning | |||

Stéphane Boucheron:
Concentration Inequalities with Machine Learning Applications | |||

Tamal Dey:
Cut Locus and Topology from Point DataA cut locus of a point p in a compact Riemannian manifold M is defined as the set of points where minimizing geodesics issued from p stop being minimizing. It is known that a cut locus contains most of the ... | |||

Boaz Nadler:
Diffusion Maps, Spectral Clustering and Reaction Coordinates of Dynamical SystemsA central problem in data analysis is the low dimensional representation of high dimensional data, and the concise description of its underlying geometry and density. In the analysis of large scale simulations of complex dynamical systems, where the notion of ... | |||

Mark Girolami:
Diffusions and Geodesic Flows on Manifolds: The Differential Geometry of Markov Chain Monte CarloMarkov Chain Monte Carlo methods provide the most comprehensive set of simulation based tools to enable inference over many classes of statistical models. The complexity of many applications presents an enormous challenge for sampling methods motivating continual innovation in theory, ... | |||

Neil D. Lawrence:
Dimensionality Reduction | |||

Dilan Görür:
Dirichlet Process: Practical Course | |||

Wouter M. Koolen:
Discovering the Truth by Conducting Experiments | |||

Yoav Freund:
Drifting Games, Boosting and Online LearningDrifting games provide a new and useful framework for analyzing learning algorithms. In this talk I will present the framework and show how it is used to derive a new boosting algorithm, called RobustBoost and a new online prediction algorithm, ... | |||

Rich Caruana:
Empirical Comparisons of Learning Methods & Case StudiesDecision trees may be intelligible, but can they cut the mustard? Have SVMs replaced neural nets, or are neural nets still best for regression, and SVMs best for classification? Boosting maximizes a margin much like SVMs, but can boosting compete ... | |||

Vladimir Vapnik:
Empirical Inference | |||

Yann LeCun:
Energy-based models & Learning for Invariant Image Recognition | |||

Robi Polikar:
Ensemble Systems for Incremental and Nonstationary Learning | |||

Robert Ghrist:
Euler Calculus and Topological Data ManagementThis talk covers the basics of an integral calculus based on Euler characteristic, and its utility in data problems, particularly in aggregation of redundant data and inverse problems over networks. This calculus is a blend of integral-geometric and sheaf theoretic ... | |||

Phil Long:
Evidence Integration in BioinformaticsBiologists frequently use databases; for example, when a biologist encounters some unfamiliar proteins, s/he will use databases to get a preliminary idea of what is known about them. The databases can be often interpreted as lists of assertions. An example ... | |||

Alexander J. Smola:
Exponential Families in Feature SpaceIn this course I will discuss how exponential families, a standard tool in statistics, can be used with great success in machine learning to unify many existing algorithms and to invent novel ones quite effortlessly. In particular, I will show ... | |||

Alexander J. Smola:
Exponential Families in Feature SpaceIn this introductory course we will discuss how log linear models can be extended to feature space. These log linear models have been studied by statisticians for a long time under the name of exponential family of probability distributions. We ... | |||

S.V.N. Vishwanathan:
Exponential Families in Feature SpaceIn this introductory course we will discuss how log linear models can be extended to feature space. These log linear models have been studied by statisticians for a long time under the name of exponential family of probability distributions. We ... | |||

S.V.N. Vishwanathan:
Exponential Families in Feature Space - Part 5In this introductory course we will discuss how log linear models can be extended to feature space. These log linear models have been studied by statisticians for a long time under the name of exponential family of probability distributions. We ... | |||

S.V.N. Vishwanathan:
Exponential Families in Feature Space - Part 6In this introductory course we will discuss how log linear models can be extended to feature space. These log linear models have been studied by statisticians for a long time under the name of exponential family of probability distributions. We ... | |||

Dragos Datcu:
Facial expression recognition and emotion recognition from speechThe presentation tackles the problem of recognizing the emotions based on video and audio data analysis. A fully automatic facial expression recognition system is based on three components: face detection, facial characteristic point extraction and classification. Face detection is employed ... | |||

Ed Stabler:
Feasible Language LearningThis talk will consider how some recent models of feasible learning might apply to human language learning, with attention to how these results complement traditional linguistic perspectives. | |||

Joshua B. Tenenbaum:
Finding structure in data | |||

Antonio Galves:
Fingerprints of Rhythm in Natural LanguageThis talk reviews a list of recent results on the rhythmic classes hypothesis produced by the Tycho Brahe Project. I start with results on the rhythmic classification of speech data based on the speech sonority. Then, I address the question ... | |||

Daniel A. Spielman:
Fitting a Graph to Vector Data We ask "What is the right graph to fit to a set of vectors?" We propose one solution that provides good answers to standard Machine Learning problems, that has interesting combinatorial properties, and that we can compute efficiently. Joint work ... | |||

Steve Smale:
Foundations of Learning | |||

Garrett Mitchener:
Game Dynamics with Learning and Evolution of Universal GrammarI will present a model of language evolution, based on population game dynamics with learning. Specifically, we consider the case of two genetic variants of universal grammar (UG), the heart of the human language faculty, assuming each admits two ... | |||

Carl Edward Rasmussen:
Gaussian Processes | |||

John Cunningham:
Gaussian Processes | |||

Edwin V. Bonilla:
Gaussian Processes | |||

Dilan Görür, Zoubin Ghahramani:
Gaussian Process: Practical Course | |||

John Langford:
Generalization boundsWhen a learning algorithm produces a classifier, a natural question to ask is "How well will it do in the future?" To make statements about the future given the past, some assumption must be made. If we make only an ... | |||

Rene Vidal:
Generalized Principal Component Analysis (GPCA)Data segmentation is usually thought of as a chicken-and-egg problem. In order to estimate a mixture of models one needs to first segment the data, and in order to segment the data one needs to know the model parameters. Therefore, ... | |||

Frederic Chazal:
Geometric Inference for Probability DistributionData often comes in the form of a point cloud sampled from an unknown compact subset of Euclidean space. The general goal of geometric inference is then to recover geometric and topological features (Betti numbers, curvatures,...) of this subset from ... | |||

Nicol Schraudolph:
Gradient Methods for Machine LearningGradient methods locally optimize an unknown differentiable function, and thus provide the engines that drive much machine learning. Here we'll take a look under the hood, beginning with a brief overview of classical gradient methods for unconstrained optimization: * Steepest descent, ... | |||

Zoubin Ghahramani:
Graph-based Semi-supervised Learning | |||

Zoubin Ghahramani:
Graphical modelsAn introduction to directed and undirected probabilistic graphical models, including inference (belief propagation and the junction tree algorithm), parameter learning and structure learning, variational approximations, and approximate inference. - Introduction to graphical models: (directed, undirected and factor graphs; conditional independence; ... | |||

Zoubin Ghahramani:
Graphical Models | |||

Christopher Bishop:
Graphical Models and Variational MethodsIn this course I will discuss how exponential families, a standard tool in statistics, can be used with great success in machine learning to unify many existing algorithms and to invent novel ones quite effortlessly. In particular, I will show ... | |||

Karen Livescu:
Graphical Models for Speech Recognition: Articulatory and Audio-Visual ModelsSince the 1980s, the main approach to automatic speech recognition has been using hidden Markov models (HMMs), in which each state corresponds to a phoneme or part of a phoneme in the context of the neighboring phonemes. Despite their crude ... | |||

Tibério Caetano:
Graphical Models for Structural Pattern RecognitionIn the "structural" paradigm for visual pattern recognition, or what some call "strong" pattern recognition, one is not satisfied with simply assigning a class label to an input object, but instead we aim at finding exactly which parts of the ... | |||

Martin J. Wainwright:
Graphical Models, Variational Methods, and Message-Passing | |||

Terry Caelli:
Graph Matching AlgorithmsGraph matching plays a key role in many areas of computing from computer vision to networks where there is a need to determine correspondences between the components (vertices and edges) of two attributed structures. In recent years three new approaches ... | |||

Wolfgang Maass:
How could networks of neurons learn to carry out probabilistic inference? | |||

Marcus Hutter:
How to predict with Bayes, MDL, and ExpertsMost passive Machine Learning tasks can be (re)stated as sequence prediction problems. This includes pattern recognition, classification, time-series forecasting, and others. Moreover, the understanding of passive intelligence also serves as a basis for active learning and decision making. In the ... | |||

Xiaochuan Pan:
How to Visualize the Unseeable | |||

Benyah Shaparenko:
Identifying Temporal Patterns and Key Players in Document CollectionsWe consider the problem of analyzing the development of a document collection over time without requiring meaningful citation data. Given a collection of timestamped documents, we formulate and explore the following two questions. First, what are the main topics and ... | |||

Jean-François Cardoso:
Independent Component AnalysisThe course provides an introduction to independent component analysis and source separation. We start from simple statistical principles; examine connections to information theory and to sparse coding; we give an overview of available algorithmics; we also show how several key ... | |||

Tom Griffiths:
Inferring structure from data | |||

Sanjoy Dasgupta:
Information GeometryThis tutorial will focus on entropy, exponential families, and information projection. We'll start by seeing the sense in which entropy is the only reasonable definition of randomness. We will then use entropy to motivate exponential families of distributions — which ... | |||

David Hawking:
Information Retrieval | |||

Thorsten Joachims:
Information Retrieval and Language TechnologyThe course will give an overview of how statistical learning can help organize and access information that is represented in textual form. In particular, it will cover tasks like text classification, information retrieval, information extraction, topic detection, and topic tracking. ... | |||

Thomas Hofmann:
Information Retrieval and Text MiningThis four hour course will provide an overview of applications of machine learning and statistics to problems in information retrieval and text mining. More specifically, it will cover tasks like document categorization, concept-based information retrieval, question-answering, topic detection and document ... | |||

Thomas Hofmann:
Information Retrieval and Text MiningThis four hour course will provide an overview of applications of machine learning and statistics to problems in information retrieval and text mining. More specifically, it will cover tasks like document categorization, concept-based information retrieval, question-answering, topic detection and document ... | |||

Nilesh Shah:
Intel Research: User Activity based Adaptive Power Management (APM)Adaptive Power Management (APM) for mobile computers attempts to reduce power consumption by placing components into low power states with low impact on "perceived" performance. The state of the art commercial solutions are timeout policies. Research in APM has focussed on ... | |||

Peter Orbanz:
Introduction to Bayesian Nonparametrics | |||

Gunnar Rätsch:
Introduction to bioinformaticsI will start by giving a general introduction into Bioinformatics, including basic biology, typical data types (sequences, structures, expression data and networks) and established analysis tasks. In the second part, I will discuss the problem of predictive sequence analysis with ... | |||

Gunnar Rätsch:
Introduction to BoostingThis course provides an introduction to theoretical and practical aspects of Boosting and Ensemble Learning. I will begin with a short description of the learning theoretical foundations of weak learners and their linear combination. Then we point out the useful ... | |||

Alexander J. Smola:
Introduction to kernel methodsThis lecture given by Mr. Smola is combined with Mr. Bernhard Schoelkopf and will encompass Part 1, Part 5, Part 6 of the complete lecture. Part 2, 3 and 4 of this lecture can be found here at Bernhard Schoelkopf's ... | |||

Bernhard Schölkopf:
Introduction to kernel methodsThis lecture given by Mr. Bernhard Schölkopf is combined with Mr. Smola and will encompass Part 2, Part 3, Part 4 of the complete lecture. Part 1, 5, 6 of this lecture can be found here at Alex Smola's ... | |||

Mikhail Belkin:
Introduction to Kernel Methods | |||

Bernhard Schölkopf:
Introduction to Kernel MethodsThe course will cover the basics of Support Vector Machines and related kernel methods: 1. Kernels and Feature Spaces 2. Large Margin Classification 3. Basic Ideas of Learning Theory 4. Support Vector Machines 5. Examples of Other Kernel Algorithms | |||

Partha Niyogi:
Introduction to Kernel Methods | |||

Alexander J. Smola:
Kernel MethodsIn this short course I will discuss exponential families, density estimation, and conditional estimators such as Gaussian Process classification, regression, and conditional random fields. The key point is that I will be providing a unified view of these estimation methods. ... | |||

Bernhard Schölkopf:
Kernel MethodsThe course will cover some basic ideas of learning theory, elements of the theory of reproducing kernel Hilbert spaces, and some machine learning algorithms that build upon this. | |||

Felix A. Wichmann:
Kernel Methods and Perceptual Classification | |||

Kenji Fukumizu:
Kernel Methods for Dependence and Causality | |||

Matthias O. Franz:
Kernel Methods for Higher Order Image StatisticsThe conditions under which natural vision systems evolved show statistical regularities determined both by the environment and by the actions of the organism. Many aspects of biological vision can be understood as evolutionary adaptations to these regularities. This is demonstrated ... | |||

Jean-Philippe Vert:
Kernel Methods in Computational BiologyMany problems in computational biology and chemistry can be formalized as classical statistical problems, e.g., pattern recognition, regression or dimension reduction, with the caveat that the data are often not vectors. Indeed objects such as gene sequences, small molecules, protein ... | |||

Marco Cuturi:
Kernels on histograms through the transportation polytopeFor two integral histograms r and c of equal sum, the Monge-Kantorovich distance MK(r,c) between r and c parameterized by a d × d cost matrix T is the minimum of all costs <F,T> taken over matrices F of the transportation polytope ... | |||

Dilan Görür:
Kingman's Coalescent for Hierarchical Representations | |||

Martin J. Wainwright:
L1-based relaxations for sparsity recovery and graphical model selection in the high-dimensional regimeThe problem of estimating a sparse signal embedded in noise arises in various contexts, including signal denoising and approximation, as well as graphical model selection. The natural optimization-theoretic formulation of such problems involves "norm" constraints (i.e., penalties on the number ... | |||

Nick Chater:
Language acquisition and Kolmogorov complexity: Why is language acquisition possible | |||

Hsuan-Tien Lin:
Large-Margin Thresholded Ensembles for Ordinal RegressionWe propose a thresholded ensemble model for ordinal regression problems. The model consists of a weighted ensemble of confidence functions and an ordered vector of thresholds. Using such a model, we could theoretically and algorithmically reduce ordinal regression problems to ... | |||

Tong Zhang:
Large Scale Ranking Problem: some theoretical and algorithmic issuesThe talk is divided into two parts. The first part focuses on web-search ranking, for which I discuss training relevance models based on DCG (discounted cumulated gain) optimization. Under this metric, the system output quality is naturally determined by the ... | |||

Stephan Canu:
Learning and Regularization Using non Positive Kernels | |||

Yali Amit:
Learning Deformable ModelsIt is widely recognized that the fundamental building block in high level computer vision is the deformable template, which represents realizations of an object class in the image as noisy geometric instantiations of an underlying model. The instantiations typically come ... | |||

Guillermo Sapiro:
Learning Dictionaries for Image Analysis and SensingSparse representations have recently drawn much attention from the signal processing and learning communities. The basic underlying model consists of considering that natural images, or signals in general, admit a sparse decomposition in some redundant dictionary. This means that we ... | |||

Yann LeCun:
Learning Feature Hierarchies | |||

John Lloyd:
Learning from Structured Data | |||

Andreas Dengel:
Learning Mental Associations as a means to build Organizational MemoriesOffice workspace reveals collections of documents structured along directories, bookmarks and email folders. The respective taxonomies represent conceptual implicit knowledge generated by the user about his/her role, tasks, and interests. Starting from that, learning methods can be applied to generate ... | |||

Yasemin Altun:
Learning on Structured DataThe discriminative learning framework is one of the most successful fields of machine learning. The methods of this paradigm, such as Boosting and Support Vector Machines, have significantly advanced the state-of-the-art for classification by improving the accuracy and by increasing the ... | |||

David McAllester:
Learning on Structured DataThe discriminative learning framework is one of the most successful fields of machine learning. The methods of this paradigm, such as Boosting and Support Vector Machines, have significantly advanced the state-of-the-art for classification by improving the accuracy and by increasing the ... | |||

Sayan Mukherjee:
Learning patterns in omic data: applications of learning theory | |||

Rao Kambhampati:
Learning techniques in PlanningIn this lecture, I aim to provide an overview of the learning techniques that have found use in automated planning. Unlike most of the clustering and classification tasks that have dominated the recent machine learning literature, learning in planning requires handling ... | |||

Brian Skyrms:
Learning to Signal | |||

Ding-Xuan Zhou:
Learning variable covariances via gradients | |||

Bernhard Schölkopf:
Learning with KernelsThe course will cover the basics of Support Vector Machines and related kernel methods. # Kernel and Feature Spaces # Large Margin Classification # Basic Ideas of Learning Theory # Support Vector Machines # Other Kernel Algorithms | |||

Bernhard Schölkopf:
Learning with KernelsThe Course on Learning with Kernels covers * Elements of Statistical Learning Theory * Kernels and feature spaces * Support vector algorithms and other kernel methods * Applications | |||

Oliver Kohlbacher:
Lost in Translation -- Solving biological problems with machine learningWe demonstrate the application of machine learning methods to problems from biology, chemistry, and pharmacy, namely the prediction of protein subcellular localization, prediction of chromatographic separation of oligonucleotides, and the prediction of percutaneous drug absorption. For these examples, we ... | |||

László Györfi:
Machine learning and finance | |||

Alexander Clark:
Machine learning and the cognitive science of natural language | |||

Dimitris Achlioptas:
Machine Learning Flavor of Random Matrices | |||

Thore Graepel:
Machine Learning for GamesThe course gives an introduction to the application of machine learning techniques to games. The course will consist of two parts, part I dealing with computer/video games, part II dealing with traditional board/strategy games. Alongside, I will introduce necessary background ... | |||

Mark Hasegawa-Johnson:
Machine Learning in Acoustic Signal ProcessingThis tutorial presents a framework for understanding and comparing applications of pattern recognition in acoustic signal processing. Representative applications will be delimited by two binary features: (1) regression vs. (2) classification (inferred variables are continuous vs. discrete), (A) instantaneous vs. ... | |||

Alexander Zien:
Machine Learning in Bioinformatics | |||

John Langford:
Machine Learning ReductionsThere are several different classification problems commonly encountered in real world applications such as 'importance weighted classification', 'cost sensitive classification', 'reinforcement learning', 'regression' and others. Many of these problems can be related to each other by simple machines (reductions) that ... | |||

Tony Jebara:
MAP Estimation with Perfect GraphsEfficiently finding the maximum a posteriori (MAP) configuration of a graphical model is an important problem which is often implemented using message passing algorithms and linear programming. The optimality of such algorithms is only well established for singly-connected graphs such ... | |||

Christian P. Robert:
Markov Chain Monte Carlo Methods0. A fundamental theorem of simulation 1. Markov chain basics 2. Slice sampling 3. Gibbs sampling 4. Metropolis-Hastings algorithms 5. Variable dimension models and reversible jump MCMC 6. Perfect sampling 7. Adaptive MCMC and population Monte Carlo | |||

Markus Hegland:
Sparse Grid MethodsThe search for interesting variable stars, the discovery of relations between geomorphological properties, satellite observations and mineral concentrations, and the analysis of biological networks all require the solution of a large number of complex learning problems with large amounts of ... | |||

Arthur Gretton:
Measures of Statistical DependenceA number of important problems in signal processing depend on measures of statistical dependence. For instance, this dependence is minimised in the context of instantaneous ICA, in which linearly mixed signals are separated using their (assumed) pairwise independence from each ... | |||

Ferenc Huszar:
Modeling Human Category Learning | |||

Nick Chater:
Models of Human decision-making | |||

Tom Griffiths:
Monte Carlo and the mind | |||

Christophe Andrieu:
Monte Carlo Simulation methodsThe course provides an introduction to independent component analysis and source separation. We start from simple statistical principles; examine connections to information theory and to sparse coding; we give an overview of available algorithmics; we also show how several key ... | |||

Nathan Srebro:
More Data Less Work: Runtime As A Monotonically Decreasing Function of Data Set SizeWe are used to studying runtime as an increasing function of the data set size, and are happy when this increase is not so bad (e.g. when the runtime increases linearly, or even polynomially, with the data set size). Traditional ... | |||

Mauro Maggioni:
Multiscale analysis on graphsAnalysis on graphs has recently been shown to lead to powerful algorithms in learning, in particular for regression, classification and clustering. Eigenfunctions of the Laplacian on a graph are a natural basis for analyzing functions on a graph, as we have ... | |||

Ronald Coifman:
Multiscale Geometry and Harmonic Analysis of Data Bases We describe a method for geometrization of databases such as questionnaires or lists of sensor outputs. Interlacing multiscale diffusion geometries of rows and columns of a data matrix results in a pair of language ontologies which are mutually supportive (certain ... | |||

Sam Roweis:
Neighbourhood Components AnalysisSay you want to do K-Nearest Neighbour classification. Besides selecting K, you also have to choose a distance function, in order to define "nearest". I'll talk about a novel method for *learning* -- from the data itself -- a distance ... | |||

Konrad Körding:
Neuroscience, cognitive science and machine learning | |||

Zoubin Ghahramani:
Nonparametric Bayesian Modelling | |||

Avrim Blum:
On a Theory of Similarity Functions for Learning and Clustering Kernel methods have become powerful tools in machine learning. They perform well in many applications, and there is also a well-developed theory of what makes a given kernel useful for a given learning problem. However, this theory requires viewing kernels ... | |||

Maria-Florina Balcan:
On Finding Low Error ClusteringsThere has been substantial work on approximation algorithms for clustering data under distance-based objective functions such as k-median, k-means, and min-sum objectives. This work is fueled in part by the hope that approximating these objectives well will indeed yield more ... | |||

Nicolò Cesa-Bianchi:
Online Learning | |||

Peter L. Bartlett:
Online Learning | |||

Manfred K. Warmuth:
Online Learning and Bregman DivergencesL 1: Introduction to Online Learning (Predicting as good as the best expert, Predicting as good as the best linear combination of experts, Additive versus multiplicative family of updates) L 2: Bregman divergences and Loss bounds (Introduction to Bregman divergences, ... | |||

Gunnar Rätsch:
Online Learning and Bregman Divergences | |||

Adam Kalai:
Online Learning and Game TheoryWe consider online learning and its relationship to game theory. In an online decision-making problem, as in Singer's lecture, one typically makes a sequence of decisions and receives feedback immediately after making each decision. As far back as the 1950's, ... | |||

Yoram Singer:
Online Learning with KernelsOnline learning is concerned with the task of making decisions on-the-fly as observations are received. We describe and analyze several online learning tasks through the same algorithmic prism. We start with online binary classification and show how to build simple ... | |||

Jyrki Kivinen:
Online Loss Bounds | |||

Vladimir Temlyakov:
On Optimal Estimators in Learning TheoryThis talk addresses some problems of supervised learning in the setting formulated by Cucker and Smale. Supervised learning, or learning from examples, refers to a process that builds on the base of available data of inputs xi and outputs yi, ... | |||

Michael I. Jordan:
On Surrogate Loss Functions, f-Divergences and Decentralized DetectionIn 1951, David Blackwell published a seminal paper - widely cited in economics - in which a link was established between the risk based on 0-1 loss and a class of functionals known as f-divergences. The latter functionals have since ... | |||

Peter J. Bickel:
On the Borders of Statistics and Computer ScienceMachine learning in computer science and prediction and classification in statistics are essentially equivalent fields. I will try to illustrate the relation between theory and practice in this huge area by a few examples and results. In particular I will ... | |||

Felipe Cucker:
On the evolution of languages | |||

Bernhard Schölkopf:
Opening of the 9th Machine Learning Summer School | |||

Stephen J. Wright:
Optimization Algorithms in Support Vector Machines This talk presents techniques for nonstationarity detection in the context of speech and audio waveforms, with broad application to any class of time series that exhibits locally stationary behavior. Many such waveforms, in particular information-carrying natural sound signals, exhibit a ... | |||

Sathiya Keerthi:
Optimization for Kernel MethodsOptimization methods play a crucial role in kernel methods such as Support Vector Machines and Kernel Logistic Regression. In a variety of scenarios, different optimization algorithms are better suited than others. The aims of the six lectures in this topic ... | |||

Robert Vanderbei:
Optimization: Theory and AlgorithmsThe course will cover linear, convex, and parametric optimization. In each of these areas, the role of duality will be emphasized as it informs the design of efficient algorithms and provides a rigorous basis for determining optimality. Various versions of ... | |||

John Shawe-Taylor:
PAC-Bayes Analysis: Background and Applications | |||

Alaa Sagheer:
Piecewise 1-Dim Self Organizing Map P1D-SOM | |||

Douglas Aberdeen:
Policy-gradient Reinforcement Learning | |||

Tong Zhang:
Predictive methods for Text miningI will give a general overview of using prediction methods in text mining applications, including text categorization, information extraction, summarization, and question answering. I will then discuss some of the more advanced issues encountered in real applications such as structured ... | |||

Ioana Cosma:
Probabilistic Counting Algorithms for Massive Data Streams | |||

David W. Hogg:
Probabilistic decision-making, data analysis, and discovery in astronomyAstronomy is a prime user community for machine learning and probabilistic modeling. There are very large, public data sets (mostly but not entirely digital imaging), there are simple but effective models of many of the most important phenomena (stars, quasars, ... | |||

Sam Roweis:
Probabilistic Graphical ModelsMy lectures will cover the basics of graphical models, also known as Bayes(ian) (Belief) Net(work)s. We will cover the basic motivations for using probabilities to represent and reason about uncertain knowledge in machine learning, and introduce graphical models as a ... | |||

Mark Johnson:
Probabilistic Models for Computational Linguistics | |||

Chia-Yu Su:
Protein Subcellular Localization Prediction Based on Compartment-Specific Biological FeaturesPrediction of subcellular localization of proteins is important for genome annotation, protein function prediction, and drug discovery. We present a prediction method for Gram-negative bacteria that uses ten one-versus-one support vector machine (SVM) classifiers, where compartment-specific biological features are selected ... | |||

Kai Zhang:
Prototype Vector Machine for Large Scale Semi-Supervised LearningPractical data analysis and mining rarely falls exactly into the supervised learning scenario. Rather, the growing amount of unlabelled data from various scientific domains poses a big challenge to large-scale semi-supervised learning (SSL). We note that the computational intensiveness of ... | |||

Manfred K. Warmuth:
Randomized PCA Algorithms with Regret Bounds that are Logarithmic in the DimensionWe design an on-line algorithm for Principal Component Analysis. The instances are projected into a probabilistically chosen low dimensional subspace. The total expected quadratic approximation error equals the total quadratic approximation error of the best subspace chosen in hindsight plus ... | |||

Tingfan Wu:
Ranking by Stealing Human CyclesRanking objects is a challenging task for machines. The main difficulty is that some characteristics of interest lack objective criteria. As the Internet becomes more widely used, it is possible to integrate the human capability of evaluating unmeasurable properties with ... | |||

Chih-Jen Lin:
Ranking Individuals by Group ComparisonsWe discuss the problem of ranking individuals from their group competition results. Many real-world problems are of this type. For example, ranking players from team games is important in some sports. In machine learning, this is closely related to multi-class ... | |||

Nicol Schraudolph:
Rapid Stochastic Gradient Descent: Accelerating Machine LearningThe incorporation of online learning capabilities into real-time computing systems has been hampered by a lack of efficient, scalable optimization algorithms for this purpose: second-order methods are too expensive for large, nonlinear models, conjugate gradient does not tolerate the noise ... | |||

Elchanan Mossel:
Recent Progress in Combinatorial Statistics I will discuss some recent progress in combinatorial statistics. In particular, I will describe progress in the areas of reconstructing graphical models, ML estimation of the Mallows model and diagnostics of MCMC. | |||

Joachim Weickert:
Regularisation in Image Analysis | |||

Jean-Philippe Vert:
Regularization of Kernel Methods by Decreasing the Bandwidth of the Gaussian Kernel | |||

Douglas Aberdeen:
Reinforcement LearningReinforcement learning is about learning good control policies given only weak performance feedback: occasional scalar rewards that might be delayed from the events that led to good performance. Reinforcement learning inherently deals with feedback systems rather than (data, class) data ... | |||

Peter L. Bartlett:
Reinforcement Learning | |||

John Langford:
Reinforcement Learning TheoryThe tutorial is on several new pieces of Reinforcement learning theory developed in the last 7 years. This includes: 1. Sample based analysis of RL including E3 and sparse sampling. 2. Generalization based analysis of RL including conservative policy iteration ... | |||

Satinder Singh:
Reinforcement learning: Tutorial + Rethinking State, Action & Reward | |||

Marcus Frean:
Restricted Boltzmann Machines and Deep Belief Nets | |||

Bin Yu:
Seeking Interpretable Models for High Dimensional DataExtracting useful information from high-dimensional data is the focus of today's statistical research and practice. After broad success of statistical machine learning on prediction through regularization, interpretability is gaining attention and sparsity has been used as its proxy. With the ... | |||

Anastasia Krithara:
Semi-supervised Learning for Text Classification | |||

Partha Niyogi:
Semi-supervised Learning, Manifold Methods | |||

Mikhail Belkin:
Semi-supervised Learning, Manifold Methods | |||

Arnaud Doucet:
Sequential Monte Carlo methods Parts 4 and 5 of this lecture are presented in Manuel Davy's "Sequential Monte Carlo methods continued" | |||

Manuel Davy:
Sequential Monte Carlo methods continuedParts 1, 2 and 3 of this lecture are presented in Arnaud Doucet's "Sequential Monte Carlo methods" | |||

Manuel Davy:
Signal Processing | |||

Maya Gupta:
Similarity-Based Classifiers: Problems and Solutions Similarity-based learning assumes one is given similarities between samples to learn from, and can be considered a special case of graph-based learning where the graph is given and fully-connected. Such problems arise frequently in computer vision, bioinformatics, and problems involving ... | |||

Ingo Steinwart:
Some Aspects of Learning Rates for SVMsWe present some learning rates for support vector machine classification. In particular we discuss a recently proposed geometric noise assumption which allows one to bound the approximation error for Gaussian RKHSs. Furthermore we show how a noise assumption proposed by Tsybakov ... | |||

Chris Burges:
Some Mathematical Tools for Machine LearningThese are lectures on some fundamental mathematics underlying many approaches and algorithms in machine learning. They are not about particular learning algorithms; they are about the basic concepts and tools upon which such algorithms are built. Often students feel intimidated ... | |||

Gunnar Rätsch:
Splice form prediction using Machine LearningAccurate ab initio gene finding is still a major challenge in computational biology. We employ state-of-the-art machine learning techniques based on Hidden Semi-Markov-SVMs to assay and improve the accuracy of genome annotations. We applied our system, called mSplicer, on the ... | |||

Peter McCullagh:
Statistical Classification and Cluster ProcessesAfter an introduction to the notion of an exchangeable random partition, we continue with a more detailed discussion of the Ewens process and some of its antecedents. The concept of an exchangeable cluster process will be described, the main example ... | |||

Olivier Bousquet:
Statistical learning theory - Learning Theory: Foundations and Goals - Learning Bounds: Ingredients and Results - Implications: What to conclude from bounds | |||

Olivier Bousquet:
Statistical Learning TheoryThis course will give a detailed introduction to learning theory with a focus on the classification problem. It will be shown how to obtain (probabilistic) bounds on the generalization error for certain types of algorithms. The main themes will be: ... | |||

John Shawe-Taylor:
Statistical Learning Theory | |||

Shahar Mendelson:
Statistical Learning Theory and Empirical Processes | |||

Uwe D. Hanebeck:
Stochastic Information Processing in Sensor Networks: Challenges, Some Solutions, and Open Problems | |||

Léon Bottou:
Stochastic Learning | |||

Students performing "Easy" on the last day of the MLSS | |||

Ryota Tomioka:
Supervised Learning on Matrices with the Dual Spectral Regularization | |||

Bernhard Schölkopf:
Support Vector Machines and Kernels | |||

Jon David Patrick:
Text CategorizationThis course will cover the principal topics important to creating a working text categorization system. It will focus on the components of such a system and processes required to create it based on the practical experiences of the Scamseek project. ... | |||

Roman Klinger:
Text Mining in Biological Texts | |||

Cynthia Rudin:
The Dynamics of AdaBoostOne of the most successful and popular learning algorithms is AdaBoost, which is a classification algorithm designed to construct a "strong" classifier from a "weak" learning algorithm. Just after the development of AdaBoost nine years ago, scientists derived margin-based ... | |||

Robert Schapire:
Theory and Applications of BoostingBoosting is a general method for producing a very accurate classification rule by combining rough and moderately inaccurate "rules of thumb". While rooted in a theoretical framework of machine learning, boosting has been found to perform quite well empirically. This ... | |||

Jochen Garcke:
The Sparse Grid MethodThe sparse grid method is a special discretization technique, which makes it possible to cope with the curse of dimensionality to some extent. It is based on a hierarchical basis and a sparse tensor product decomposition. Sparse grids have been successfully used ... | |||

Marina Meila:
The stability of a good clusteringIf we have found a "good" clustering C of data set X, can we prove that C is not far from the (unknown) best clustering C* of this data set? Perhaps surprisingly, the answer to this question is sometimes yes. ... | |||

Dmitry Pechyony:
Transductive Rademacher Complexity and its Applications | |||

Robert Nowak:
Trees for Regression and ClassificationTree models are widely used for regression and classification problems, with interpretability and ease of implementation being among their chief attributes. Despite the widespread use of tree models, a comprehensive theoretical analysis of their performance has only begun to emerge in ... | |||

Han-Shen Huang:
Triple jump acceleration for the EM algorithm and its extrapolation-based variantsAitken's acceleration is one of the most commonly used methods to speed up fixed-point iteration computations, including the EM algorithm. However, it requires computing or approximating the Jacobian of the EM mapping matrix, which can be intractable ... | |||

Alexander J. Smola:
Unifying Divergence Minimization and Statistical Inference via Convex DualityWe unify divergence minimization and statistical inference by means of convex duality. In the process of doing so, we prove that the dual of approximate maximum entropy estimation is maximum a posteriori estimation. Moreover, our treatment leads to stability and ... | |||

Peter Grünwald:
Universal Modeling: Introduction to modern MDLWe give a tutorial introduction to the *modern* Minimum Description Length (MDL) Principle, taking into account the many refinements and developments that have taken place in the 1990s. These do not seem to be widely known outside the information theory ... | |||

David McAllester:
Unsupervised Learning for Stereo Vision We consider the problem of learning to estimate depth from stereo image pairs. This can be formulated as unsupervised learning - the training pairs are not labeled with depth. We have formulated an algorithm which maximizes conditional likelihood the left ... | |||

Alexander J. Smola:
Unsupervised Learning with Kernels | |||

Stephen Smale:
Vision and Hodge TheoryA general mathematical Hodge theory will be presented together with its relationship to spaces of images. | |||

David McAllester, John Langford:
Welcome | |||

Amit Singer:
What Do Unique Games, Structural Biology and the Low-Rank Matrix Completion Problem Have In CommonWe will formulate several data-driven applications as MAX2LIN and d-to-1 games, and show how to (approximately) solve them using efficient spectral and semidefinite program relaxations. The relaxations perform incredibly well in the presence of a large number of outlier measurements ... | |||

Neil D. Lawrence:
What is Machine Learning? | |||

## Opening: |
|||

Steve Smale, Mikhail Belkin:
Welcome Speech at the MLSS 2009 | |||

## Tutorials: |
|||

Sanjoy Dasgupta:
Analysis of Clustering ProceduresClustering procedures are notoriously short on rigorous guarantees. In this tutorial, I will cover some of the types of analysis that have been applied to clustering, and emphasize open problems that remain. Part I. Approximation algorithms for clustering Two popular ... | |||

Emmanuel Candes:
An Overview of Compressed Sensing and Sparse Signal Recovery via L1 MinimizationIn many applications, one often has fewer equations than unknowns. While this seems hopeless, the premise that the object we wish to recover is sparse or nearly sparse radically changes the problem, making the search for solutions feasible. This lecture ... | |||

Tom Minka:
Approximate Inference | |||

Peter Green:
Bayesian InferenceInference is the process of learning from data about the mechanisms that may have caused or generated that data, or that at least explain it. The goals are varied - perhaps simply predicting future data, or more ambitiously drawing conclusions about scientific ... | |||

Carl Edward Rasmussen:
Bayesian inference and Gaussian processes | |||

Yee Whye Teh:
Bayesian NonparametricsMachine learning researchers often have to contend with issues of model selection and model fitting in the context of large complicated models and sparse data. The idea which I am pushing for in this project is that these can be ... | |||

Michael I. Jordan:
Bayesian or Frequentist, Which Are You? | |||

Chris Watkins:
Behavioural Learning: Inspiration from Nature? | |||

Robert Schapire:
BoostingBoosting is a general method for producing a very accurate classification rule by combining rough and moderately inaccurate "rules of thumb." While rooted in a theoretical framework of machine learning, boosting has been found to perform quite well empirically. This ... | |||

Vladimir Koltchinskii:
Bounding Excess Risk in Machine LearningWe will discuss a general approach to the problem of bounding the excess risk of learning algorithms based on empirical risk minimization (possibly penalized). This approach has been developed in the recent years by several authors (among others: Massart; Bartlett, ... | |||

Phil Dawid:
Causality | |||

Nick Chater:
Cognitive science for machine learning 1: What is cognitive science? | |||

Nick Chater:
Cognitive science for machine learning 2: Empirical methods | |||

Tom Griffiths:
Cognitive science for machine learning 3: Models and theories in cognitive science | |||

Andrew Blake:
Computer Vision | |||

Rao Kotagiri:
Contrast Data Mining: Methods and Applications The ability to distinguish, differentiate and contrast between different datasets is a key objective in data mining. Such an ability can assist domain experts to understand their data, and can help in building classification models. This presentation will introduce the ... | |||

Lieven Vandenberghe:
Convex OptimizationThe lectures will provide an introduction to the theory and applications of convex optimization. The emphasis will be on results useful for convex modeling, i.e., recognizing and formulating convex optimization problems in practice. - The first lecture will introduce some ... | |||

Lieven Vandenberghe:
Convex Optimization | |||

Lieven Vandenberghe:
Convex Optimization The lectures will give an introduction to the theory and applications of convex optimization, and an overview of recent developments in algorithms. The first lecture will cover the basics of convex analysis, focusing on the results that are most useful ... | |||

Geoffrey E. Hinton:
Deep Belief Networks | |||

Volker Tresp:
Dirichlet Processes and Nonparametric Bayesian ModellingBayesian modeling is a principled approach to updating the degree of belief in a hypothesis given prior knowledge and given available evidence. Both prior knowledge and evidence are combined using Bayes' rule to obtain the a posteriori hypothesis. In most ... | |||

Yee Whye Teh:
Dirichlet Processes: Tutorial and Practical CourseThe Bayesian approach allows for a coherent framework for dealing with uncertainty in machine learning. By integrating out parameters, Bayesian models do not suffer from overfitting, thus it is conceivable to consider models with infinite numbers of parameters, aka Bayesian ... | |||

Alexander J. Smola:
Exponential FamiliesIn this introductory course we will discuss how log linear models can be extended to feature space. These log linear models have been studied by statisticians for a long time under the name of exponential family of probability distributions. We ... | |||

Marcus Hutter:
Foundations of Machine LearningMachine learning is usually taught as a bunch of methods that can solve a bunch of problems (see above). The second part of the tutorial takes a step back and asks about the foundations of machine learning, in particular the ... | |||

Peter Orbanz:
Foundations of Nonparametric Bayesian Methods | |||

Carl Edward Rasmussen:
Gaussian Processes | |||

Partha Niyogi, Mikhail Belkin:
Geometric Methods and Manifold Learning | |||

Zoubin Ghahramani:
Graphical Models | |||

Yair Weiss:
Graphical Models and ApplicationsCompressed sensing is a recent set of mathematical results showing that sparse signals can be exactly reconstructed from a small number of linear measurements. Interestingly, for ideal sparse signals with no measurement noise, random measurements allow perfect reconstruction while measurements ... | |||

Martin J. Wainwright:
Graphical Models and message-passing algorithms | |||

Aapo Hyvärinen:
Independent Component AnalysisIn independent component analysis (ICA), the purpose is to linearly decompose a multidimensional data vector into components that are as statistically independent as possible. For nongaussian random vectors, this decomposition is not equivalent to decorrelation as is done by principal ... | |||

Tibério Caetano:
Inference in Graphical Models
This short course will cover the basics of inference in graphical models. It will start by explaining the theory of probabilistic graphical models, including concepts of conditional independence and factorisation and how they arise in both Markov random fields and ... | |||

Thomas Hofmann:
Information Retrieval | |||

David MacKay:
Information Theory | |||

Christopher Bishop:
Introduction To Bayesian Inference | |||

Silvia Chiappa:
Introduction to Graphical Models | |||

Olivier Bousquet:
Introduction to Learning Theory
The goal of this course is to introduce the key concepts of learning theory. It will not be restricted to Statistical Learning Theory but will mainly focus on statistical aspects. Instead of giving detailed proofs and precise statements, this course ... | |||

Csaba Szepesvári:
Introduction to Reinforcement Learning
The tutorial will introduce Reinforcement Learning, that is, learning what actions to take, and when to take them, so as to optimize long-term performance. This may involve sacrificing immediate reward to obtain greater reward in the long-term or just to ... | |||

Marcus Hutter:
Introduction to Statistical Machine Learning
The first part of his tutorial provides a brief overview of the fundamental methods and applications of statistical machine learning. The other speakers will detail or build upon this introduction. Statistical machine learning is concerned with the development of algorithms ... | |||

Bernhard Schölkopf:
Kernel Methods | |||

Bernhard Schölkopf:
Kernel Methods
The course will start with basic ideas of machine learning, followed by some elements of learning theory. It will also introduce positive definite kernels and their associated feature spaces, and show how to use them for kernel mean embeddings, SVMs, ... | |||

Alexander J. Smola:
Kernel methods and Support Vector Machines
The tutorial will introduce the main ideas of statistical learning theory, support vector machines, and kernel feature spaces. This includes a derivation of the support vector optimization problem for classification and regression, the ν-trick, various kernels and an overview of ... | |||

John Shawe-Taylor:
Kernel Methods and Support Vector Machines
Kernel methods have become a standard tool for pattern analysis during the last fifteen years since the introduction of support vector machines. We will introduce the key ideas and indicate how this approach to pattern analysis enables a relatively easy ... | |||

Wray Buntine:
Latent Variable Models for Document Analysis
Wray Buntine will consider various problems in document analysis (named entity recognition, natural language parsing, information retrieval), and look at various probabilistic graphical models and algorithms for addressing the problem. This will not be an extensive coverage of information extraction ... | |||

Simon Lucey:
Learning in Computer Vision
In this tutorial he will cover some of the core fundamentals in vision and demonstrate how they can be interpreted in terms of machine learning fundamentals. Unbeknownst to most researchers in the field of machine learning, the fundamentals of object registration ... | |||

John Shawe-Taylor:
Learning Theory | |||

Nicolò Cesa-Bianchi:
Learning Theory: statistical and game-theoretic approaches
The theoretical foundations of machine learning have a double nature: statistical and game-theoretic. In this course we take advantage of both paradigms to introduce and investigate a number of basic topics, including mistake bounds and risk bounds, empirical risk minimization, ... | |||

Neil D. Lawrence:
Learning with Probabilities | |||

Joshua B. Tenenbaum:
Machine Learning and Cognitive Science | |||

Neil D. Lawrence:
Machine learning for cognitive science 1: What is machine learning? | |||

Bernhard Schölkopf:
Machine learning for cognitive science 2: Bayesian methods and statistical learning theory | |||

Bernhard Schölkopf:
Machine learning for cognitive science 3: Kernel methods and Bayesian methods | |||

Christfried Webers:
Machine Learning Laboratory
The first laboratory was not recorded, but it featured some hands-on experiments with Elefant (http://elefant.developer.nicta.com.au/), mainly concentrating on installing, using, and developing machine learning algorithms within the Elefant framework. We will walk through examples of implementing a simple ... | |||

S.V.N. Vishwanathan:
Machine Learning Laboratory
The first laboratory was not recorded, but it featured some hands-on experiments with Elefant (http://elefant.developer.nicta.com.au), mainly concentrating on installing, using, and developing machine learning algorithms within the Elefant framework. We will walk through examples of implementing a simple ... | |||

Sam Roweis:
Machine Learning, Probability and Graphical Models | |||

Iain Murray:
Markov Chain Monte Carlo | |||

Nick Chater:
Models and theories in cognitive science (Part 2) | |||

Arnaud Doucet:
Monte Carlo Methods
We will first review the Monte Carlo principle and standard Monte Carlo methods including rejection sampling, importance sampling and standard Markov chain Monte Carlo (MCMC) methods. We will then discuss more advanced MCMC methods such as adaptive MCMC methods and ... | |||

Nando de Freitas:
Monte Carlo Simulation for Statistical Inference, Model Selection and Decision Making
The first part of his course will consist of two presentations. In the first presentation, he will introduce fundamentals of Monte Carlo simulation for statistical inference, with emphasis on algorithms such as importance sampling, particle filtering and smoothing for dynamic ... | |||

Yee Whye Teh:
Nonparametric Bayesian Models | |||

Avrim Blum:
Online Learning, Regret Minimization, and Game Theory
The first part of the tutorial will discuss adaptive algorithms for making decisions in uncertain environments (e.g., what route should I take to work if I have to decide before I know what traffic will be like today?) and connections to ... | |||

Simon Godsill:
Particle Filters | |||

Peter L. Bartlett:
Pattern Classification and Large Margin Classifiers
These lectures will provide an introduction to the theory of pattern classification methods. They will focus on relationships between the minimax performance of a learning system and its complexity. There will be four lectures. The first will review the formulation ... | |||

Felix A. Wichmann:
Psychophysical Methods, Signal Detection Theory & Response Times | |||

Satinder Singh:
Reinforcement Learning
MDPs/VI, Q learning (w/ proof), TD(lambda), Function approximation, options, PSRs | |||

Michael Littman:
Reinforcement Learning | |||

Jerry (Xiaojin) Zhu:
Semi-Supervised Learning
This tutorial covers classification approaches that utilize both labeled and unlabeled data. We will review self-training, Gaussian mixture models, co-training, multiview learning, graph-transduction and manifold regularization, transductive SVMs, and a PAC bound for semi-supervised learning. We then discuss some new ... | |||

Rémi Gribonval:
Sparse Methods for Under-determined Inverse Problems | |||

Chih-Jen Lin:
Support Vector Machines
Support vector machines (SVM) and kernel methods are important machine learning techniques. In this short course, we will introduce their basic concepts. We then focus on the training and optimization procedures of SVM. Examples demonstrating the practical use of SVM ... | |||

Robert Nowak, Rui Castro:
Theory, Methods and Applications of Active Learning
Traditional approaches to machine learning and statistical inference are passive, in the sense that all data are collected prior to analysis in a non-adaptive fashion. One can envision, however, more active strategies in which information gleaned from previously collected data ... | |||

David Blei:
Topic Models | |||

Andrew Blake:
Topics in image and video processing | |||

John Langford:
Tutorial on Machine Learning Reductions
There are several different classification problems commonly encountered in real world applications such as 'importance weighted classification', 'cost sensitive classification', 'reinforcement learning', 'regression' and others. Many of these problems can be related to each other by simple machines (reductions) that ... | |||

Joshua B. Tenenbaum:
What is cognitive science? |