The Fixed Points of Off-Policy TD

Published on Sep 06, 2012 | 2827 Views

Off-policy learning, the ability for an agent to learn about a policy other than the one it is following, is a key element of Reinforcement Learning, and in recent years there has been much work on de

Chapter list

The Fixed Points of Off-Policy TD 00:00
Can be solved, in principle, by Temporal Difference learning 00:32
This work is about fixing off-policy TD 01:37
Guarantees on resulting solution quality 02:47