Autonomous Exploration in Reinforcement Learning

Published on Jan 25, 2012 · 4414 views

One of the striking differences between current reinforcement learning algorithms and early human learning is that animals and infants appear to explore their environments with autonomous purpose, in …

Chapter list

Autonomous Exploration in Reinforcement Learning (00:00)
Motivation (00:18)
Evaluation of autonomous exploration algorithms - 01 (01:54)
Evaluation of autonomous exploration algorithms - 02 (02:16)
Evaluation of autonomous exploration algorithms - 03 (03:24)
Evaluation of autonomous exploration algorithms - 04 (05:55)
Learning to navigate (06:48)
Reaching all states that are reachable in L steps - 01 (09:15)
Reaching all states that are reachable in L steps - 02 (09:19)
Excluding unreachable states - 01 (10:19)
Excluding unreachable states - 02 (11:02)
Excluding unreachable states (counterexample) (11:10)
Excluding intermediate states - 01 (13:53)
Excluding intermediate states - 02 (14:44)
Reinforcement learning - 01 (18:54)
Reinforcement learning - 02 (19:14)
Discounted and undiscounted rewards - 01 (19:29)
Discounted and undiscounted rewards - 02 (20:00)
PAC-MDP bounds for discounted rewards - 01 (20:42)
PAC-MDP bounds for discounted rewards - 02 (21:33)
Regret bounds (21:40)
PAC-MDP bounds from regret bounds (23:34)
Optimistic policies for regret bounds in RL - 01 (24:55)
Optimistic policies for regret bounds in RL - 02 (26:00)
Intuition about optimistic policies (26:28)
Consistent MDPs - 01 (27:07)
Consistent MDPs - 02 (28:10)
Main quantities in the proof - 01 (29:22)
Main quantities in the proof - 02 (30:55)
Summing over episodes (discounted UCRL) (32:15)
Optimistic algorithm for autonomous exploration (32:18)
Analysis (1) (33:52)
Analysis (2) (35:00)
Analysis (3): Consistent MDPs (36:03)
Summary (36:05)
(Why) is autonomous exploration useful? (37:26)