Ensemble Monte-Carlo Planning: An Empirical Study
published: July 21, 2011, recorded: June 2011, views: 4280
Report a problem or upload filesIf you have found a problem with this lecture or would like to send us extra material, articles, exercises, etc., please use our ticket system to describe your request and upload the data.
Enter your e-mail into the 'Cc' field, and we will keep you updated with your request's status.
Monte-Carlo planning algorithms, such as UCT, select actions at each decision epoch by intelligently expanding a single search tree given the available time and then selecting the best root action. Recent work has provided evidence that it can be advantageous to instead construct an ensemble of search trees and to make a decision according to a weighted vote. However, these prior investigations have only considered the application domains of Go and Solitaire and were limited in the scope of ensemble configurations considered. In this paper, we conduct a more exhaustive empirical study of ensemble Monte-Carlo planning using the UCT algorithm in a set of six additional domains. In particular, we evaluate the advantages of a broad set of ensemble configurations in terms of space and time efficiency in both parallel and single core models. Our results demonstrate that ensembles are an effective way to improve performance per unit time given a parallel time model and performance per unit space in a single-core model. However, contrary to prior isolated observations, we did not find significant evidence that ensembles improve performance per unit time in a single-core model.
Link this pageWould you like to put a link to this lecture on your homepage?
Go ahead! Copy the HTML snippet !