Regularized Off-Policy TD-Learning

Published on Jan 14, 2013 · 3306 Views

We present a novel ℓ1-regularized, convergent off-policy TD-learning method (termed RO-TD) that learns sparse representations of value functions with low computational complexity. The algor

Chapter list

Regularized Off-Policy TD-Learning 00:00
Problem Setting 00:02
Essence of RO-TD Algorithm 01:33
Performance of RO-TD Algorithm 02:46