
Optimization II

Published on Oct 11, 2018 · 1595 views


Chapter list

Optimization: Part II 00:00
Many thanks to 01:01
Different perspectives on optimization 01:10
Different perspectives on nonlinear optimization 02:30
Deterministic and Stochastic Optimization 05:19
Stochastic and Deterministic Large-Scale Nonlinear Optimization Worlds 07:44
To make this concrete let’s talk about Momentum and Acceleration 12:11
Momentum (Heavy Ball Method) 12:19
Consider what momentum can do in the non-convex case 14:00
But momentum works in practice 14:48
Nesterov acceleration 17:58
Acceleration with noisy gradients 19:11
Understanding SGD 20:17
Convergence 20:30
Fixed steplength, Diminishing steplength 23:46
Efficiency of SGD 26:17
Non-convexity and SGD 28:44
Weaknesses of SGD? 30:30
Optimization panorama 32:41
SGD 32:53
Three approaches for constructing second order information 34:00
Mini-Batches 35:27
The trade-offs of larger batch sizes 36:55
Robust minimizers 39:22
Progressive sampling gradient method - 1 40:36
Progressive sampling gradient method - 2 42:00
How to use progressive sampling in practice? 43:11
Two strategies 43:52
Implementation via sample variances - 1 44:22
Implementation via sample variances - 2 44:50
On the Steplengths 45:27
Scaling the Search Direction 45:51
Scaling the Gradient Direction 46:41
Different gradient components should be scaled differently 49:25
Newton’s method 50:30
A Fundamental Equation for Newton’s method 53:00
Inexact Newton Method 56:05
Sub-sampled Hessian Newton Methods 57:46
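
The middle chapters make momentum concrete via the heavy-ball method. As a minimal illustrative sketch (not taken from the lecture slides), the heavy-ball iteration keeps a velocity term so that x_{k+1} = x_k - α∇f(x_k) + β(x_k - x_{k-1}); the toy quadratic objective and the values of α and β below are assumed example choices:

```python
import numpy as np

# Heavy-ball (momentum) method on a toy quadratic
# f(x) = 0.5 * x^T A x - b^T x. A, b, alpha, and beta are
# example values, not parameters from the lecture.
A = np.array([[3.0, 0.0], [0.0, 1.0]])
b = np.array([1.0, 1.0])

def grad(x):
    return A @ x - b

alpha, beta = 0.1, 0.9           # steplength and momentum coefficient
x = np.zeros(2)
v = np.zeros(2)                  # "velocity" accumulating past steps
for k in range(200):
    v = beta * v - alpha * grad(x)   # velocity update
    x = x + v                        # equals x_k - alpha*grad + beta*(x_k - x_{k-1})
print(x, np.linalg.solve(A, b))      # iterate vs. exact minimizer
```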
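The progressive-sampling chapters ("Implementation via sample variances") concern tests that grow the batch size adaptively. Below is a minimal sketch of one such variance test and sampling loop, assuming a hypothetical per-example gradient function grad_i; the tolerance theta, the initial batch size, and the batch-doubling rule are illustrative assumptions, not the lecture's exact algorithm:

```python
import numpy as np

def sample_variance_test(grads, theta=0.9):
    """Check whether the sampled gradient estimate looks reliable.

    grads: (n, d) array of per-example gradients for the current sample.
    Returns True if the sample variance is small relative to the norm of
    the sampled gradient:  Var / n <= theta^2 * ||g_S||^2.
    """
    n = grads.shape[0]
    g_S = grads.mean(axis=0)                    # sampled gradient
    var = np.sum((grads - g_S) ** 2) / (n - 1)  # trace of sample covariance
    return var / n <= theta**2 * np.dot(g_S, g_S)

def progressive_sgd(grad_i, x0, n_total, batch=8, alpha=0.1, iters=100):
    """Illustrative loop: enlarge the sample when the test fails,
    otherwise take a plain gradient step with the current sample."""
    x, rng = x0.copy(), np.random.default_rng(0)
    for _ in range(iters):
        idx = rng.choice(n_total, size=min(batch, n_total), replace=False)
        grads = np.stack([grad_i(i, x) for i in idx])
        if not sample_variance_test(grads):
            batch = min(2 * batch, n_total)     # grow the sample size
        x = x - alpha * grads.mean(axis=0)
    return x
```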
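The closing chapters cover inexact Newton and sub-sampled Hessian Newton methods. A hedged sketch of the basic idea, assuming a hypothetical per-example Hessian-vector product hess_vec_i: solve the sub-sampled Newton system H_S p = -g only approximately, here with a plain conjugate-gradient loop:

```python
import numpy as np

def cg_solve(matvec, b, tol=1e-2, maxiter=50):
    """Conjugate gradients for matvec(x) = b (matvec assumed SPD)."""
    x = np.zeros_like(b)
    r = b - matvec(x)
    p = r.copy()
    rs = r @ r
    for _ in range(maxiter):
        Ap = matvec(p)
        step = rs / (p @ Ap)
        x += step * p
        r -= step * Ap
        rs_new = r @ r
        if np.sqrt(rs_new) <= tol * np.linalg.norm(b):
            break                            # inexact solve: stop early
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x

def subsampled_newton_step(hess_vec_i, grad, sample_idx, x):
    """Inexact Newton direction p ~= -H_S^{-1} g with a sub-sampled Hessian.

    hess_vec_i(i, x, v): hypothetical per-example Hessian-vector product.
    """
    def Hv(v):
        return sum(hess_vec_i(i, x, v) for i in sample_idx) / len(sample_idx)
    return cg_solve(Hv, -grad)
```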