Deep RL
Published on Oct 11, 2018
1653 Views
Chapter list
Deep Reinforcement Learning and the Atari 2600 00:00
DEEP, REINFORCEMENT LEARNING 00:14
Deep RL 00:41
Some challenges in deep RL - 1 01:55
Some challenges in deep RL - 2 02:52
Some challenges in deep RL - 3 04:26
Some challenges in deep RL - 4 05:15
Stella - 1 05:59
Stella - 2 07:12
Stella - 3 08:16
General Competency 09:12
Narrow competency 09:55
Diverse, Interesting, Independent 11:01
Diverse - 1 11:21
Diverse - 2 11:45
Diverse - 3 13:08
Interesting (to people) 13:21
Interesting (by and for people) 14:04
Early attempts (2010 - 2013) - 1 14:56
Early attempts (2010 - 2013) - 2 16:10
Deep Q-Networks (DQN) 17:38
DQN - 1 19:29
DQN - 2 20:10
DQN - 3 21:39
DQN - 4 22:41
LSTM 23:31
Dueling networks (Wang et al., 2016)... 26:30
Prioritized replay (Schaul et al., 2016)... 27:03
Double Q-learning (van Hasselt et al., 2015)... 28:54
1. Distributional reinforcement learning 30:03
1. Distributional reinforcement learning - 1 33:50
Bellman equation - 1 34:03
Ground truth, Implied model 35:00
Implied model - 1 35:59
Implied model - 2 37:11
Implied model - 3 37:30
Bellman equation - 2 37:45
Distributional Bellman equation 38:01
Value distribution 38:51
$150 39:05
$450 - 1 40:37
$450 - 2 41:30
$300 41:44
Discrete distribution 41:47
ε-greedy w.r.t. expected value 43:02
Approximation 44:25
From x, a, sample a transition - 1 44:49
From x, a, sample a transition - 2 45:04
From x, a, sample a transition - 3 49:36
Mean | Median | > H.B. | > DQN 50:20
Seaquest 51:35
Time - 1 52:16
Time - 2 54:30
Distributional perspective 55:02
2. Exploration with pseudo-counts 55:41
September 2017 - 1 56:53
September 2017 - 2 57:02
September 2017 - 3 58:04
September 2017 - 4 58:48
Exploration 59:34
Exploration - 2 01:00:59
Most observations 01:01:36
Generative model 01:01:59
Density model - 1 01:02:15
Density model - 2 01:02:34
Train 01:03:09
The “CTS” model - 1 01:08:17
The “CTS” model - 2 01:09:12
periods without salient events 01:09:31
Exploration - 3 01:10:18
Exploration - 4 01:11:18
Start 01:12:00
Average Score 01:14:24
Credit assignment issues in exploration 01:15:42
Effect of mixed Monte Carlo update 01:16:00
Removing extrinsic rewards - 1 01:17:04
Removing extrinsic rewards - 2 01:17:29
Removing extrinsic rewards - 3 01:18:00
Deep Reinforcement Learning and the Atari 2600 01:18:42