Preconditioned Temporal Difference Learning
published: Aug. 12, 2008, recorded: July 2008, views: 74
Description
This paper extends many recently popular reinforcement learning (RL) algorithms to a generalized framework that includes least-squares temporal difference (LSTD) learning, least-squares policy evaluation (LSPE), and a variant of incremental LSTD (iLSTD). The basis of this extension is a preconditioning technique for solving a stochastic model equation. The paper also studies three significant issues of the new framework: it presents a new step-size rule that can be computed online, provides an iterative way to apply preconditioning, and reduces the complexity of the related algorithms to near that of temporal difference (TD) learning.
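To make the idea of a preconditioned update concrete, here is a minimal sketch. It is not taken from the paper or the slides. It assumes the model equation has the linear form A θ + b = 0 with A = Φᵀ D (γP − I) Φ and b = Φᵀ D r, and it uses exact expected quantities on a made-up 3-state chain rather than the sampled estimates an online algorithm would use. The function names, the toy chain, and the three choices of preconditioner C are all illustrative assumptions.

```python
import numpy as np


def build_model(P, r, Phi, d, gamma):
    """Return (A, b) of the assumed model equation A @ theta + b = 0.

    P: transition matrix, r: expected rewards, Phi: feature matrix,
    d: stationary distribution of P, gamma: discount factor.
    """
    n = len(r)
    D = np.diag(d)
    A = Phi.T @ D @ (gamma * P - np.eye(n)) @ Phi
    b = Phi.T @ D @ r
    return A, b


def preconditioned_iteration(A, b, C, alpha, n_iters):
    """Iterate theta <- theta + alpha * C^{-1} (A theta + b) from zero."""
    theta = np.zeros(len(b))
    for _ in range(n_iters):
        residual = A @ theta + b
        # Solve C x = residual instead of forming the inverse of C.
        theta = theta + alpha * np.linalg.solve(C, residual)
    return theta


# Hypothetical toy chain (doubly stochastic, so the stationary
# distribution is uniform). None of these numbers come from the paper.
P = np.array([[0.5, 0.5, 0.0],
              [0.0, 0.5, 0.5],
              [0.5, 0.0, 0.5]])
d = np.full(3, 1.0 / 3.0)
r = np.array([1.0, 0.0, 2.0])
Phi = np.array([[1.0, 0.0],
                [0.0, 1.0],
                [1.0, 1.0]])
gamma = 0.9

A, b = build_model(P, r, Phi, d, gamma)

# C = I: an expected TD(0)-style step with a fixed step size.
theta_td = preconditioned_iteration(A, b, np.eye(2), alpha=0.5, n_iters=1000)

# C = Phi^T D Phi: an LSPE-style step with step size 1.
C_lspe = Phi.T @ np.diag(d) @ Phi
theta_lspe = preconditioned_iteration(A, b, C_lspe, alpha=1.0, n_iters=1000)

# C = -A with step size 1: one step lands on the LSTD solution -A^{-1} b.
theta_lstd = preconditioned_iteration(A, b, -A, alpha=1.0, n_iters=1)
```

Under these assumptions all three choices of C reach the same fixed point, and only the number of iterations and the per-step cost differ. That trade-off between convergence speed and per-step complexity is what the description alludes to.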
See Also:
Download slides:
icml08_yao_ptd_01.pdf (544.0 KB)