Sample-Based Planning for Continuous Action Markov Decision Processes

author: Chris Mansley, Department of Computer Science, Rutgers, The State University of New Jersey
published: July 21, 2011,   recorded: June 2011,   views: 4171
Description

In this paper, we present a new algorithm that integrates recent advances in solving continuous bandit problems with sample-based rollout methods for planning in Markov Decision Processes (MDPs). Our algorithm, Hierarchical Optimistic Optimization applied to Trees (HOOT), addresses planning in continuous-action MDPs. Empirical results show that our algorithm matches or exceeds the performance of a comparable discrete-action planner while eliminating the need for manual discretization of the action space.
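The abstract describes combining a continuous-armed bandit (Hierarchical Optimistic Optimization, HOO) with rollout-based planning. As a rough, hypothetical sketch of the HOO-style action selection that such a planner would run at each decision point, the code below optimizes a noisy 1-D reward over the action interval [0, 1] by adaptively refining a binary partition of it. The class names, reward function, and parameter values are illustrative, not from the paper, and the B-value here is a simplified optimistic score; the real HOO algorithm additionally tightens each cell's bound with a minimum over its children's B-values.

```python
import math
import random

class HOONode:
    """One cell of a hierarchical binary partition of the action interval [lo, hi]."""
    def __init__(self, lo, hi, depth):
        self.lo, self.hi, self.depth = lo, hi, depth
        self.count = 0          # times an action from this cell was played
        self.value_sum = 0.0    # total reward observed for actions in this cell
        self.children = None    # two half-interval children once expanded

    def b_value(self, total_pulls, rho=0.5, nu=1.0):
        # Optimistic score: empirical mean + UCB-style confidence width
        # + a diameter term nu * rho**depth that shrinks as cells get smaller.
        if self.count == 0:
            return math.inf
        mean = self.value_sum / self.count
        width = math.sqrt(2.0 * math.log(total_pulls) / self.count)
        return mean + width + nu * rho ** self.depth

def hoo_select(root, total_pulls):
    """Descend to a leaf by greedy B-value, expand it, and return
    (action, path) where the action is the leaf cell's midpoint."""
    node, path = root, [root]
    while node.children is not None:
        node = max(node.children, key=lambda c: c.b_value(total_pulls))
        path.append(node)
    mid = 0.5 * (node.lo + node.hi)
    node.children = [HOONode(node.lo, mid, node.depth + 1),
                     HOONode(mid, node.hi, node.depth + 1)]
    return mid, path

def hoo_update(path, reward):
    """Credit the observed reward to every cell on the selected path."""
    for node in path:
        node.count += 1
        node.value_sum += reward

def hoo_recommend(root):
    """Follow the most-visited children and return that cell's midpoint."""
    node = root
    while node.children is not None:
        best = max(node.children, key=lambda c: c.count)
        if best.count == 0:
            break
        node = best
    return 0.5 * (node.lo + node.hi)

def demo(n_pulls=2000, seed=0):
    """Optimize the noisy 1-D reward f(a) = 1 - (a - 0.7)**2 over [0, 1]."""
    random.seed(seed)
    root = HOONode(0.0, 1.0, 0)
    for t in range(1, n_pulls + 1):
        action, path = hoo_select(root, t)
        reward = 1.0 - (action - 0.7) ** 2 + random.gauss(0.0, 0.05)
        hoo_update(path, reward)
    return hoo_recommend(root)

recommended = demo()
print(f"recommended action: {recommended:.3f}")  # should land in the high-reward region
```

In the full HOOT planner, one such bandit tree would sit at every visited state of the rollout tree, choosing the action for each simulated step; here only the single-state bandit is sketched.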

Download slides: icaps2011_mansley_markov_01.pdf (1.3 MB)

