Theory of RL

Published on Jul 27, 2017
4855 Views

Chapter list

Theory of reinforcement learning (00:00)
Theory of reinforcement learning - 1 (00:35)
Contents (00:58)
What we will not cover (02:28)
What and why? (03:36)
What is a “theory” (for us)? (04:41)
Who do you want to be? (06:43)
I won’t do theory. Should I care? (08:58)
I won’t do theory. Should I care? - 1 (10:55)
I won’t do theory. Should I care? - 2 (12:25)
Statistical learning theory: ingredients (13:01)
What to predict? (23:18)
A priori analysis (24:20)
Two fundamental results in SLT (25:09)
The fundamental theorem of SLT (26:06)
Computational complexity (28:02)
Batch learning (33:44)
Batch RL: The learning problem (33:51)
Batch RL and supervised learning (37:39)
Batch RL with nontrivial horizons (39:22)
A “generic” recipe for positive result (48:49)
when you have a simulator (49:33)
Planning problem (50:41)
Working with large MDPs (52:22)
Fitted Value Iteration (53:39)
New problem: Instability (56:21)
Disaster strikes (57:41)
and with neural nets (58:38)
Conclusions (58:43)
Pushing it harder (01:00:28)
From FVI to DQN (01:02:44)
From FVI to DQN (01:03:04)
Map of planning methods (01:05:48)
no simulator, no pain..? Uh..no (01:06:13)
Defining online learning (01:06:35)
Why should you care? (01:08:20)
The challenge (01:10:02)
Warmup: Bandits/terminology (01:14:09)
The key result on (stochastic) bandits (01:15:37)
Optimism in the face of uncertainty (01:16:56)
An instance-dependent result (01:17:08)
An instance-independent result (01:17:41)
How about MDPs? (01:17:53)
Frontiers (01:20:34)
Conclusions/summary (01:20:59)
Conclusions/summary - 1 (01:22:37)
Questions (01:23:32)