Lock-Free Approaches to Parallelizing Stochastic Gradient Descent

Published on 2012-01-25

7828 Views

Benjamin Recht

Stochastic Gradient Descent (SGD) is a very popular optimization algorithm for solving data-driven machine learning problems. SGD is well suited to processing large amounts of data due to its robustne

Optimization for Machine Learning

Related categories

Stochastic Optimization

Presentation

Lock-Free Approaches to Parallelizing Stochastic Gradient Descent (00:00)

Incremental Gradient Descent (00:27)

Example: Computing the mean - 01 (03:23)

Example: Computing the mean - 02 (04:04)

Convergence Rates - 01 (05:28)

Convergence Rates - 02 (06:48)

SGD and BIG Data (07:37)

Is SGD inherently Serial? (08:57)

HOGWILD! (13:23)

“Sparse” Function (14:50)

Sparse Support Vector Machines (17:20)

Matrix Completion (18:50)

Graph Cuts (20:34)

Convergence Theory (21:53)

Hogs gone wild! (24:35)

Speedups (26:49)

JELLYFISH - 01 (29:14)

JELLYFISH - 02 (32:19)

JELLYFISH - 03 (32:53)

Example Optimization - 01 (35:04)

Example Optimization - 02 (35:34)

Example Optimization - 03 (37:05)

Simplest Problem? Least Squares (39:20)

Simple Question (40:59)

What about generically? - 01 (42:50)

What about generically? - 02 (42:56)

What about generically? - 03 (43:58)

But what about on average? (46:40)

Summary (48:59)