Regularized Off-Policy TD-Learning

Published on 2013-01-14

3332 Views

Bo Liu

We present a novel l1-regularized, off-policy convergent TD-learning method (termed RO-TD) that can learn sparse representations of value functions with low computational complexity. The algor

Knowledge 4 All Foundation Video Journal Volume 3

Presentation

Regularized Off-Policy TD-Learning 00:00

Problem Setting 00:02

Essence Of RO-TD Algorithm 01:33

Performance of RO-TD Algorithm 02:46