Machine Learning

Published on Jul 27, 2017 · 36,133 views

Chapter list

00:00  Introduction to Machine Learning
00:16  Outline
00:35  Types of machine learning problems
00:46  Supervised learning
02:02  Example: Face detection and recognition
02:23  Reinforcement learning
02:54  Example: TD-Gammon (Tesauro, 1990-1995)
04:04  Unsupervised learning
04:21  Example: Oncology (Alizadeh et al.)
05:06  Example: A data set
05:41  Data (continued)
06:21  Terminology
06:41  More formally
07:01  Supervised learning problem
07:49  Steps to solving a supervised learning problem
08:38  Example: What hypothesis class should we pick?
09:10  Linear hypothesis
09:51  Error minimization!
10:00  Least mean squares (LMS)
10:37  Steps to solving a supervised learning problem
10:40  Notation reminder
11:10  A bit of algebra
11:38  The solution
12:19  Example: Data and best linear hypothesis
12:23  Linear regression summary
12:26  Linear function approximation in general
13:39  Linear models in general
14:23  Remarks
16:18  Order-3 fit
16:22  Order-4 fit
16:24  Order-5 fit
16:26  Order-6 fit
16:29  Order-7 fit
16:39  Order-9 fit
17:18  Order-8 fit
17:42  Order-2 fit
18:15  Overfitting
18:23  Overfitting and underfitting
20:40  Overfitting more formally
21:59  Typical overfitting plot
23:51  Cross-validation
24:52  The anatomy of the error of an estimator
26:10  Bias-variance analysis
27:13  Recall: Statistics 101
27:33  The variance lemma
30:33  Error decomposition
32:03  Bias-variance decomposition (2)
32:27  Bias-variance decomposition
33:38  Bias-variance trade-off
34:16  More on overfitting
35:21  Coming back to mean-squared error function...
35:55  A probabilistic assumption
36:35  Bayes theorem in learning
37:59  Choosing hypotheses
39:02  Maximum likelihood estimation
40:30  The log trick
40:48  Maximum likelihood for regression
41:07  Applying the log trick
44:23  Maximum likelihood hypothesis for least-squares estimators
44:31  A graphical representation for the data generation
47:02  Regularization
48:00  Regularization for linear models
50:45  What L2 regularization does
51:02  Visualizing regularization (2 parameters)
53:01  Pros and cons of L2 regularization
53:53  L1 regularization for linear models
54:56  Solving L1 regularization
55:27  Visualizing L1 regularization
55:29  Pros and cons of L1 regularization
56:07  Example of L1 vs L2 effect
56:50  Bayesian view of regularization (1)
59:05  Bayesian view of regularization (2)
01:02:36  What does the Bayesian view give us? (1)
01:03:46  What does the Bayesian view give us? (2)
01:04:44  What does the Bayesian view give us? (3)
01:13:26  Logistic regression
01:15:53  The cross-entropy error function
01:17:22  Cross-entropy error surface for logistic function
01:18:03  Gradient descent
01:18:46  Example gradient descent traces
01:19:13  Gradient descent algorithm
01:20:18  Maximization procedure: Gradient ascent
01:21:15  Another algorithm for optimization
01:21:51  Application to machine learning
01:22:12  Second-order methods: Multivariate setting
01:23:11  Which method is better?
01:23:47  Newton-Raphson for logistic regression
01:24:09  Regularization for logistic regression
01:25:10  Probabilistic view of logistic regression
01:25:17  Recap
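
Illustrative sketches

The chapters from "Least mean squares (LMS)" (10:00) to "The solution" (11:38) derive the closed-form least-squares weights w = (Φ^T Φ)^{-1} Φ^T y. Below is a minimal NumPy sketch of that computation; the synthetic data, noise level, and variable names are illustrative assumptions, not the lecture's own example.

```python
import numpy as np

# Synthetic data: y = 3x - 1 plus Gaussian noise (illustrative choice)
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=20)
y = 3.0 * x - 1.0 + rng.normal(scale=0.1, size=20)

# Design matrix with a column of ones so w[0] plays the intercept role
Phi = np.column_stack([np.ones_like(x), x])

# Closed-form least-squares solution: w = (Phi^T Phi)^{-1} Phi^T y.
# np.linalg.solve avoids forming an explicit inverse, which is
# numerically safer than np.linalg.inv.
w = np.linalg.solve(Phi.T @ Phi, Phi.T @ y)
print(w)  # expect roughly [-1.0, 3.0]
```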
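
The "Order-k fit" chapters (16:18 onward) and "Cross-validation" (23:51) show how raising the polynomial order drives training error down while held-out error eventually climbs. A hedged sketch of that experiment, using a single train/validation split as a stand-in for full k-fold cross-validation; the target function and noise level are my assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.uniform(-1.0, 1.0, size=30)
y = np.sin(np.pi * x) + rng.normal(scale=0.2, size=30)

# Hold out half the data as a simple stand-in for k-fold cross-validation
x_tr, y_tr = x[:15], y[:15]
x_va, y_va = x[15:], y[15:]

for order in (1, 2, 3, 5, 9):
    coeffs = np.polyfit(x_tr, y_tr, deg=order)
    mse_tr = np.mean((np.polyval(coeffs, x_tr) - y_tr) ** 2)
    mse_va = np.mean((np.polyval(coeffs, x_va) - y_va) ** 2)
    # Training MSE keeps falling with order; validation MSE eventually
    # rises again -- the "typical overfitting plot" (21:59)
    print(f"order {order}: train {mse_tr:.3f}, validation {mse_va:.3f}")
```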
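
"The anatomy of the error of an estimator" (24:52) through "Bias-variance trade-off" (33:38) decompose the expected squared error at a point into noise, squared bias, and variance. A small simulation, under assumed choices of target function and noise, that estimates the bias and variance of a degree-3 polynomial fit at one query point by resampling training sets:

```python
import numpy as np

rng = np.random.default_rng(2)
f = lambda t: np.sin(np.pi * t)   # assumed "true" function
sigma = 0.3                       # assumed noise standard deviation
x0 = 0.5                          # query point

# Fit the same model class on many independently drawn training sets
preds = []
for _ in range(2000):
    x = rng.uniform(-1.0, 1.0, size=20)
    y = f(x) + rng.normal(scale=sigma, size=20)
    preds.append(np.polyval(np.polyfit(x, y, deg=3), x0))
preds = np.array(preds)

bias_sq = (preds.mean() - f(x0)) ** 2
variance = preds.var()
# Expected squared error at x0 decomposes as noise + bias^2 + variance
print(f"bias^2 = {bias_sq:.4f}, variance = {variance:.4f}, noise = {sigma**2:.4f}")
```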
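
"Maximum likelihood for regression" (40:48) through "Maximum likelihood hypothesis for least-squares estimators" (44:23) connect Gaussian noise to the squared-error criterion. The standard derivation, assuming y_i = w^T x_i + ε_i with ε_i ~ N(0, σ²) i.i.d.:

```latex
\log p(y \mid X, w)
  = \sum_{i=1}^{m} \log \frac{1}{\sqrt{2\pi}\,\sigma}
      \exp\!\left(-\frac{(y_i - w^\top x_i)^2}{2\sigma^2}\right)
  = -m \log\!\left(\sqrt{2\pi}\,\sigma\right)
      - \frac{1}{2\sigma^2} \sum_{i=1}^{m} \left(y_i - w^\top x_i\right)^2 .
```

Since the first term and the factor 1/(2σ²) do not depend on w, maximizing the log-likelihood over w is exactly minimizing the sum of squared errors.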
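
"Regularization for linear models" (48:00) adds an L2 penalty, which "Bayesian view of regularization" (56:50) reinterprets as MAP estimation under a Gaussian prior on the weights. A hedged sketch of the ridge closed form w = (Φ^T Φ + λI)^{-1} Φ^T y; the feature map, the λ grid, and penalizing the intercept are my choices. L1, by contrast, has no closed form ("Solving L1 regularization", 54:56) and is typically handled with subgradient or coordinate methods.

```python
import numpy as np

def ridge_fit(Phi, y, lam):
    """L2-regularized least squares: w = (Phi^T Phi + lam*I)^{-1} Phi^T y.

    Note: this sketch penalizes every weight, including the intercept
    column; many treatments leave the intercept unpenalized.
    """
    d = Phi.shape[1]
    return np.linalg.solve(Phi.T @ Phi + lam * np.eye(d), Phi.T @ y)

# Shrinkage demo on degree-9 polynomial features (illustrative data)
rng = np.random.default_rng(3)
x = rng.uniform(-1.0, 1.0, size=20)
y = np.sin(np.pi * x) + rng.normal(scale=0.2, size=20)
Phi = np.vander(x, N=10)  # columns x^9, ..., x, 1

for lam in (0.0, 1e-3, 1e-1, 10.0):
    w = ridge_fit(Phi, y, lam)
    # Larger lambda shrinks the weight vector toward zero
    print(f"lambda = {lam:g}: ||w|| = {np.linalg.norm(w):.2f}")
```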
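
The closing chapters (01:13:26 to 01:25:17) fit logistic regression by minimizing cross-entropy, first with gradient descent (01:19:13) and then with Newton-Raphson (01:23:47), whose update uses the Hessian X^T S X with S = diag(p_i(1 - p_i)). One self-contained sketch of both optimizers on synthetic labels; the step size, iteration counts, and data generator are assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Synthetic binary labels from a known weight vector (illustrative)
rng = np.random.default_rng(4)
X = rng.normal(size=(200, 2))
y = (rng.uniform(size=200) < sigmoid(X @ np.array([2.0, -1.0]))).astype(float)

# --- Gradient descent on the mean cross-entropy error ---
w = np.zeros(2)
lr = 0.5
for _ in range(1000):
    p = sigmoid(X @ w)
    w -= lr * X.T @ (p - y) / len(y)   # gradient of mean cross-entropy
print("gradient descent:", w)

# --- Newton-Raphson: w <- w - H^{-1} g ---
w = np.zeros(2)
for _ in range(8):
    p = sigmoid(X @ w)
    g = X.T @ (p - y)                        # gradient of cross-entropy
    H = X.T @ (X * (p * (1 - p))[:, None])   # Hessian: X^T S X
    w -= np.linalg.solve(H, g)
print("Newton-Raphson:", w)
```

Newton typically converges here in a handful of iterations at a higher per-step cost, which is the trade-off the "Which method is better?" chapter (01:23:11) weighs.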