Stochastic Dual Coordinate Ascent Methods for Regularized Loss Minimization
published: Jan. 16, 2013, recorded: December 2012, views: 3648
Stochastic Gradient Descent (SGD) has become popular for solving large-scale supervised machine learning optimization problems such as SVM, due to its strong theoretical guarantees. While the closely related Dual Coordinate Ascent (DCA) method has been implemented in various software packages, it has so far lacked a good convergence analysis. We present a new analysis of Stochastic Dual Coordinate Ascent (SDCA) showing that this class of methods enjoys strong theoretical guarantees that are comparable to or better than those of SGD. This analysis justifies the effectiveness of SDCA for practical applications.
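As a rough illustration of the kind of method the abstract describes, below is a minimal Python sketch of SDCA applied to the L2-regularized linear SVM with hinge loss, where each step maximizes the dual objective in one randomly chosen coordinate in closed form. The function name `sdca_svm`, its parameters, and the toy data are illustrative assumptions, not taken from the talk or slides.

```python
import numpy as np

def sdca_svm(X, y, lam=0.01, epochs=10, seed=0):
    """Minimal SDCA sketch (illustrative, not the speaker's code).

    Solves  min_w  (1/n) * sum_i max(0, 1 - y_i <w, x_i>) + (lam/2) ||w||^2
    by ascending the dual objective one randomly chosen coordinate at a time.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    alpha = np.zeros(n)   # dual variables, each constrained to [0, 1]
    w = np.zeros(d)       # primal iterate, kept equal to (1/(lam*n)) * sum_i alpha_i y_i x_i
    sq_norms = np.einsum('ij,ij->i', X, X)  # precomputed ||x_i||^2

    for _ in range(epochs):
        for i in rng.permutation(n):
            if sq_norms[i] == 0.0:
                continue
            # Closed-form maximizer of the dual objective along coordinate i,
            # projected back onto the feasible interval [0, 1].
            margin = y[i] * X[i].dot(w)
            delta = lam * n * (1.0 - margin) / sq_norms[i]
            new_alpha = np.clip(alpha[i] + delta, 0.0, 1.0)
            # Keep w consistent with the updated dual variable.
            w += (new_alpha - alpha[i]) * y[i] * X[i] / (lam * n)
            alpha[i] = new_alpha
    return w, alpha

# Toy usage: two Gaussian blobs with labels in {-1, +1}.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(-1, 1, (50, 2)), rng.normal(1, 1, (50, 2))])
y = np.concatenate([-np.ones(50), np.ones(50)])
w, _ = sdca_svm(X, y, lam=0.1, epochs=20)
print("training accuracy:", np.mean(np.sign(X @ w) == y))
```

Note that, unlike SGD, each SDCA step here needs no step-size schedule: the coordinate-wise maximization and the box constraint on the dual variable determine the update entirely, which is part of the method's practical appeal.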
Download slides: nipsworkshops2012_shalev_shwartz_minimization_01.pdf (427.7 KB)