Beyond Stochastic Gradient Descent

Published on 2013-08-26, 7157 views

Francis R. Bach

Many machine learning and signal processing problems are traditionally cast as convex optimization problems. A common difficulty in solving these problems is the size of the data, where there are man

ROKS 2013 - Leuven

Presentation

Beyond stochastic gradient descent 00:00

Context 00:29

Outline (1) 03:04

Supervised machine learning 04:25

Smoothness and strong convexity (1) 06:39

Smoothness and strong convexity (2) 07:28

Smoothness and strong convexity (3) 07:40

Smoothness and strong convexity (4) 08:41

Iterative methods for minimizing smooth functions 10:14

Stochastic approximation 12:24

Convex stochastic approximation 14:28

Convex stochastic approximation - existing work (1) 15:55

Convex stochastic approximation - existing work (2) 17:23

Adaptive algorithm for logistic regression (1) 19:05

Adaptive algorithm for logistic regression (2) 20:16

Least-mean-square algorithm 23:02

Markov chain interpretation of constant step sizes 25:08

Simulations - synthetic examples (1) 26:59

Simulations - benchmarks (1) 28:49

Beyond least-squares - Markov chain interpretation 30:17

Simulations - synthetic examples (2) 31:38

Restoring convergence through online Newton steps 33:02

Choice of support point for online Newton steps 35:23

Simulations - synthetic examples (3) 37:17

Simulations - benchmarks (2) 38:21

Outline (2) 39:14

Going beyond a single pass over the data 39:37

Stochastic vs. deterministic methods (1) 41:06

Stochastic vs. deterministic methods (2) 41:15

Stochastic vs. deterministic methods (3) 41:18

Stochastic vs. deterministic methods (4) 41:44

Stochastic average gradient (1) 41:52

Stochastic average gradient (2) 43:51

Conclusions 44:14