The Fixed Points of Off-Policy TD
Published on Sep 06, 2012
2828 Views
Off-policy learning, the ability for an agent to learn about a policy other than the one it is following, is a key element of Reinforcement Learning, and in recent years there has been much work on de
Chapter list
The Fixed Points of Off-Policy TD (00:00)
Can be solved, in principle, by Temporal Difference learning (00:32)
This work is about fixing off-policy TD (01:37)
Guarantees on resulting solution quality (02:47)