video thumbnail
Pause
Mute
Subtitles
Playback speed
0.25
0.5
0.75
1
1.25
1.5
1.75
2
Full screen

The optimistic principle for online planning in Markov decision processes

Published on 2013-05-282645 Views

Given an initial state, what is the best possible action that can be returned by a planning algorithm that is given a finite numerical budget (e.g. number of calls to a model of the state-transition

Related categories