Proximal Reinforcement Learning: Learning to Act in Primal-Dual Spaces

Published on Jul 28, 2015 · 4281 views

In this talk, we set forth a new framework for reinforcement learning that we have developed over the past few years, one that yields mathematically rigorous solutions to longstanding fundamental questions…

Chapter list

Proximal Reinforcement Learning: Learning to Act in Primal-Dual Spaces (00:00)
Thanks to my collaborators (00:33)
Proximal RL Framework (01:40)
Three Level Analysis (04:02)
The Key Idea of Proximal RL (06:30)
Recent publications (08:23)
Developing a True Stochastic Gradient TD Algorithm: The End of a 30-year Quest? (08:54)
Stability of RL Algorithms (09:20)
Instability of TD-Learning (11:36)
Take the Blue Pill or the Red? (12:46)
Operator Splitting (14:52)
Big-O Complexity (15:27)
Safe Reinforcement Learning with Projected Natural Actor Critic (16:00)
Natural Actor Critic (16:11)
Conjugate Functions (17:37)
Mirror Maps (18:34)
Mirror Descent = Natural Gradient! (19:08)
Proximal Mapping generalizes projections (19:54)
Gradient Descent as proximal mapping (21:23)
Gradient Descent as proximal operator (22:53; see the proximal-update sketch after this list)
Primal Approach to Gradient TD (23:56)
Details of the Framework - 2 (23:56)
Details of the Framework - 1 (23:58)
Linear System Reformulation of Gradient TD (26:38)
Unified Objective for Gradient TD (27:15)
Saddle point formulation (27:37; see the saddle-point sketch after this list)
Lemma (28:49)
“Gradient” TD Methods (29:17)
Analysis of gradient TD (29:34)
What is the “optimal” gradient TD method? (29:46)
Variational Inequality (30:48)
Extragradient Method (30:53)
Extragradient TD-Learning (31:15)
What is the “optimal” gradient TD method? (31:28)
Mirror-Prox (32:03)
Proximal Gradient TD Algorithms (32:22; see the GTD2 sketch after this list)
Baird MDP (33:04)
50-state chain domain (33:17)
“Safe” Reinforcement Learning (33:55)
Equivalence of Natural Gradient Descent and Mirror Descent (33:58)
Safe Robot Learning with PNAC (34:47)
Proximal Reinforcement Learning (35:05)
Ongoing Work (35:14)
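
Sketch for the proximal-mapping chapters (19:54, 21:23, 22:53) and the mirror-descent chapters (18:34, 19:08): the LaTeX below is my own illustration of the standard definitions, not formulas taken from the slides; the mirror map \psi and step size \eta are generic symbols introduced here.

    \operatorname{prox}_{\eta f}(x) = \arg\min_u \Big\{ f(u) + \tfrac{1}{2\eta}\,\|u - x\|^2 \Big\}

    x_{k+1} = \arg\min_u \Big\{ \langle \nabla f(x_k), u \rangle + \tfrac{1}{2\eta}\,\|u - x_k\|^2 \Big\} = x_k - \eta\,\nabla f(x_k)

    x_{k+1} = \arg\min_u \Big\{ \langle \nabla f(x_k), u \rangle + \tfrac{1}{\eta}\,D_\psi(u, x_k) \Big\}

The first line is the proximal mapping, which reduces to a Euclidean projection when f is the indicator of a convex set; the second reads a plain gradient-descent step as a proximal update of the linearized objective; the third is mirror descent, where D_\psi is the Bregman divergence of the mirror map \psi. With \psi(x) = \tfrac{1}{2}\|x\|^2 the third line recovers the second, and for other mirror maps the update behaves like a natural-gradient step, which is the equivalence the talk emphasizes at 19:08 and 33:58.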
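Sketch for the gradient-TD objective and saddle-point chapters (26:38, 27:15, 27:37): the notation below uses the standard gradient-TD quantities (A, b, M, features \phi_t, discount \gamma) from the published literature, not necessarily the symbols on the slides, and the 1/2 factor is only a convention.

    A = \mathbb{E}\big[\phi_t(\phi_t - \gamma\,\phi_{t+1})^{\top}\big], \qquad b = \mathbb{E}[r_t\,\phi_t], \qquad M = \mathbb{E}[\phi_t\,\phi_t^{\top}]

    \mathrm{MSPBE}(\theta) = \tfrac{1}{2}\,\| b - A\theta \|^{2}_{M^{-1}} = \max_{w}\; \langle b - A\theta,\, w \rangle - \tfrac{1}{2}\,\| w \|^{2}_{M}

    \min_{\theta}\;\max_{w}\; \langle b - A\theta,\, w \rangle - \tfrac{1}{2}\,\| w \|^{2}_{M}

The inner maximum is attained at w = M^{-1}(b - A\theta), so the saddle-point problem is equivalent to minimizing the mean-squared projected Bellman error; and because A, b, and M are all expectations over single transitions, both players' gradients can be sampled without bias, which is what makes a true stochastic gradient TD method possible (the 08:54 chapter).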
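Sketch for the gradient-TD algorithm chapters (29:17, 31:15, 32:22; also 08:54): a minimal Python illustration of a GTD2-style two-timescale update derived from the saddle-point objective above. It follows the published gradient-TD literature rather than code from the talk; the function name gtd2_step and the step sizes alpha and beta are illustrative.

    import numpy as np

    def gtd2_step(theta, w, phi, phi_next, reward, gamma, alpha, beta):
        """Single GTD2-style update on one transition, assuming linear features.

        theta : primal weights of the linear value function V(s) = phi(s) . theta
        w     : auxiliary (dual) weights tracking the expected TD error per feature
        alpha : primal step size; beta : dual step size (typically beta > alpha)
        """
        delta = reward + gamma * np.dot(phi_next, theta) - np.dot(phi, theta)  # TD error
        theta = theta + alpha * (phi - gamma * phi_next) * np.dot(phi, w)      # primal step
        w = w + beta * (delta - np.dot(phi, w)) * phi                          # dual step
        return theta, w

    # Toy usage on random transitions (illustration only, not an experiment from the talk):
    rng = np.random.default_rng(0)
    d = 8
    theta, w = np.zeros(d), np.zeros(d)
    for _ in range(1000):
        phi, phi_next = rng.random(d), rng.random(d)
        theta, w = gtd2_step(theta, w, phi, phi_next, reward=rng.random(),
                             gamma=0.9, alpha=0.01, beta=0.05)

The extragradient and mirror-prox variants discussed around 30:53-32:03 add a look-ahead step: the same updates are first evaluated at a provisional point, and the final step then uses the gradients from that provisional point; that refinement is omitted in this sketch.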