Dyna(k): A Multi-Step Dyna Planning
Description
Dyna planning is an efficient way of learning
from real and imaginary experience. Existing
tabular and linear Dyna algorithms are
single-step, because an "imaginary" feature
is predicted only one step into the future. In
this paper, we introduce multi-step Dyna
planning, which predicts more steps into the
future. Multi-step Dyna can work out
a sequence of multi-step results when a real
instance happens, provided that the instance itself,
or a similar experience, has been imagined
(i.e., simulated from the model) and
planned. Our multi-step Dyna is based on a
multi-step model, which we call the λ-model.
The λ-model interpolates between the one-step
model and an infinite-step model, and
can be learned efficiently online. The multi-step
Dyna algorithm, Dyna(k), uses the λ-model
to generate predictions k steps ahead
of the imagined feature, and applies TD to
these imagined multi-step transitions.
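To make the idea of a k-step imagined transition concrete, here is a minimal sketch in plain Python. It is not the λ-model from the paper: it assumes a one-step linear model, a feature-transition matrix `F` and a reward vector `b`, and simply iterates it k times to project an imagined feature forward, then applies one TD update toward the k-step target. All names (`k_step_projection`, `dyna_k_update`, `F`, `b`, `theta`) are hypothetical and chosen only for illustration.

```python
def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def matvec(F, x):
    """Matrix-vector product; F is a list of rows."""
    return [dot(row, x) for row in F]


def k_step_projection(F, b, phi, k, gamma):
    """Iterate an assumed one-step linear model k times from feature phi.

    Returns (discounted reward accumulated over the k imagined steps,
             predicted feature vector after k steps).
    """
    reward, discount, x = 0.0, 1.0, list(phi)
    for _ in range(k):
        reward += discount * dot(b, x)  # imagined reward at this step
        x = matvec(F, x)                # imagined next feature
        discount *= gamma
    return reward, x


def dyna_k_update(theta, F, b, phi, k, gamma, alpha):
    """One TD update of the weights theta on a k-step imagined transition."""
    reward, phi_k = k_step_projection(F, b, phi, k, gamma)
    td_error = reward + (gamma ** k) * dot(theta, phi_k) - dot(theta, phi)
    return [t + alpha * td_error * p for t, p in zip(theta, phi)]


# Toy two-state chain (an assumption for illustration): state 0 -> state 1 -> end.
F = [[0.0, 0.0],
     [1.0, 0.0]]      # maps the feature of state 0 to that of state 1, then to zero
b = [1.0, 2.0]        # expected reward when leaving state 0 and state 1
theta = [0.0, 0.0]    # value-function weights
phi = [1.0, 0.0]      # imagined feature: state 0

theta_one_step = dyna_k_update(theta, F, b, phi, k=1, gamma=1.0, alpha=1.0)
theta_two_step = dyna_k_update(theta, F, b, phi, k=2, gamma=1.0, alpha=1.0)
```

In this toy chain the true value of state 0 is 3. With `k=1` the update bootstraps from the still-zero estimate for state 1, while with `k=2` the imagined transition runs to the end of the chain, so a single update already carries the full return. This is the kind of benefit a multi-step projection is meant to give; how the λ-model itself is learned and applied is left to the paper.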
Slides
| Time | Title |
| 0:00 | Multi-step Dyna-style Planning |
| 0:11 | Outline |
| 0:38 | Planning |
| 1:09 | Dyna Planning Architecture |
| 2:16 | Multi-step Dyna planning? |
| 2:47 | Multi-step Models |
| 3:36 | Single-step/Multi-step Models |
| 4:37 | Single-step Models |
| 5:40 | Multi-step Linear Model |
| 6:58 | k-step Dyna Planning |
| 8:13 | Results: Boyan Chain |
| 9:53 | Conclusion |