Optimization Algorithms in Machine Learning

author: Stephen J. Wright, Computer Sciences Department, University of Wisconsin - Madison
published: Jan. 12, 2011, recorded: December 2010, views: 5778

Slides

0:00 Optimization Algorithms in Machine Learning
1:16 Optimization
2:49 Topics
3:57 I. First-Order Methods
5:08 What's the Setup?
5:52 Gradient
7:51 Constant (Short) Steplength
8:15 Gradient
8:30 Constant (Short) Steplength
10:29 The 1/k^2 Speed Limit
11:38 Exact minimizing: Faster rate?
12:57 Multistep Methods: Heavy-Ball (1)
14:59 Multistep Methods: Heavy-Ball (2)
16:00 Summary: Linear Convergence, Strictly Convex f
16:39 Conjugate Gradient
19:05 Accelerated First-Order Methods (1)
21:07 Accelerated First-Order Methods (2)
21:29 Accelerated First-Order Methods (1)
22:00 Accelerated First-Order Methods (2)
22:01 Convergence Results: Nesterov
22:41 Accelerated First-Order Methods (2)
22:43 Accelerated First-Order Methods (1)
22:50 Convergence Results: Nesterov
22:55 A Non-Monotone Gradient Method: Barzilai-Borwein
24:35 Comparison: BB vs Greedy Steepest Descent
26:23 Many BB Variants
27:02 Primal-Dual Averaging
28:26 Convergence Properties
29:23 Extending to the Constrained Case
30:25 Example: Nesterov's Constant Step Scheme
31:08 Regularized Optimization
33:07 Further Reading (1)
33:47 II. Stochastic and Incremental Gradient Methods
34:52 Applications
35:42 Subgradients (1)
35:53 Subgradients (2)
36:22 Subgradients (1)
36:58 "Classical" Stochastic Approximation
37:59 Rate: 1/k (1)
39:49 Rate: 1/k (2)
40:00 Rate: 1/k (1)
40:05 Rate: 1/k (2)
41:37 But... What if we don't know...
42:09 Robust SA
43:17 Analysis of Robust SA (1)
44:34 Analysis of Robust SA (2)
45:31 Robust SA
45:38 Analysis of Robust SA (2)
45:40 Analysis of Robust SA (3)
46:13 Mirror Descent (1)
47:13 Mirror Descent (2)
48:20 Bregman distance
48:31 Mirror Descent (2)
48:42 Mirror Descent (1)
48:50 Mirror Descent (2)
49:02 Bregman Distances: Examples
49:56 Incremental Gradient
51:42 Achievable Accuracy
52:40 Applications to SVM
53:54 Further Reading (2)


Videos

Part 1: 54:11
Part 2: 55:22

Description

Optimization provides a valuable framework for thinking about, formulating, and solving many problems in machine learning. Since specialized techniques for the quadratic programming problem arising in support vector classification were developed in the 1990s, there has been more and more cross-fertilization between optimization and machine learning, with the large size and computational demands of machine learning applications driving much recent algorithmic research in optimization. This tutorial reviews the major computational paradigms in machine learning that are amenable to optimization algorithms, then discusses the algorithmic tools that are being brought to bear on such applications. We focus particularly on such algorithmic tools of recent interest as stochastic and incremental gradient methods, online optimization, augmented Lagrangian methods, and the various tools that have been applied recently in sparse and regularized optimization.
