Dyna(k): A Multi-Step Dyna Planning
published: Aug. 26, 2009, recorded: June 2009, views: 16
Description
Dyna planning is an efficient way of learning from real and imaginary experience. Existing tabular and linear Dyna algorithms are single-step, because an "imaginary" feature is predicted only one step into the future. In this paper, we introduce a multi-step Dyna planning that predicts more steps into the future. Multi-step Dyna is able to figure out a sequence of multi-step results when a real instance happens, given that the instance itself, or a similar experience, has been imagined (i.e., simulated from the model) and planned. Our multi-step Dyna is based on a multi-step model, which we call the λ-model. The λ-model interpolates between the one-step model and an infinite-step model, and can be learned efficiently online. The multi-step Dyna algorithm, Dyna(k), uses the λ-model to generate predictions k steps ahead of the imagined feature, and applies TD on this imaginary multi-step transitioning.
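To make the idea of applying TD to an imagined k-step transition concrete, here is a minimal Python sketch. It is not the λ-model or the Dyna(k) algorithm from the paper: as a simplifying assumption it obtains the k-step prediction by iterating a one-step linear model k times. The names `F` (feature transition matrix), `b` (reward weights), `theta` (value weights), `gamma`, `alpha`, and both function names are hypothetical and chosen only for illustration.

```python
# Hypothetical sketch: TD planning on an imagined k-step transition.
# ASSUMPTION: the k-step prediction comes from iterating a one-step
# linear model (next feature ~ F @ phi, reward ~ b . phi). This stands
# in for the lambda-model described above and is NOT the paper's method.

def matvec(M, v):
    """Multiply matrix M (list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * c for a, c in zip(u, v))

def k_step_projection(F, b, phi, k, gamma):
    """Roll the linear model forward k steps from feature vector phi.

    Returns the discounted reward accumulated over the k imagined steps
    and the predicted feature vector after those k steps.
    """
    total_reward, discount, current = 0.0, 1.0, list(phi)
    for _ in range(k):
        total_reward += discount * dot(b, current)
        current = matvec(F, current)
        discount *= gamma
    return total_reward, current

def planning_update(theta, F, b, phi, k, gamma, alpha):
    """Apply one TD update to theta using an imagined k-step transition."""
    reward_k, phi_k = k_step_projection(F, b, phi, k, gamma)
    td_error = reward_k + (gamma ** k) * dot(theta, phi_k) - dot(theta, phi)
    return [t + alpha * td_error * p for t, p in zip(theta, phi)]

# Toy chain with tabular features: state 0 -> state 1 -> termination.
F = [[0.0, 0.0],
     [1.0, 0.0]]          # column j holds the successor of state j
b = [0.0, 1.0]            # reward 1 is collected when leaving state 1
phi0 = [1.0, 0.0]         # imagined feature: state 0
theta0 = [0.0, 0.0]

one_step = planning_update(theta0, F, b, phi0, k=1, gamma=0.9, alpha=1.0)
two_step = planning_update(theta0, F, b, phi0, k=2, gamma=0.9, alpha=1.0)
```

In this toy chain, a single update with k=1 leaves the value estimate of state 0 unchanged, because the successor's value has not been learned yet, whereas a single update with k=2 already incorporates the reward that lies two steps ahead. That is the kind of benefit the description attributes to looking more than one step into the future, under the simplifying assumptions stated here.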
Download slides:
icml09_yao_dmsdp_01.pdf (2.7 MB)