Trade-Offs in Sampling-Based Adversarial Planning

Published on 2011-07-21

3710 Views

Raghuram Ramanujan

The Upper Confidence bounds for Trees (UCT) algorithm has in recent years captured the attention of the planning and game-playing community due to its notable success in the game of Go. However, attem

ICAPS 2011 - Freiburg

Related categories

Planning and Scheduling

Presentation

Trade-offs in Sampling-based Adversarial Planning (00:00)

Upper Confidence bounds for Trees (UCT) (00:11)

Understanding UCT (00:42)

The Multi-Armed Bandit Problem - 2 (01:26)

The UCB1 Bandit Algorithm - 1 (01:31)

The UCB1 Bandit Algorithm - 2 (02:59)

From Bandits to Tree Search (03:45)

The UCT Algorithm - 1 (04:21)

The UCT Algorithm - 2 (05:47)

UCT in Action (06:26)

Minimax in Action (07:17)

UCT versus Minimax - 1 (07:42)

UCT versus Minimax - 2 (08:30)

Mancala (09:04)

UCT in Mancala (09:51)

Complete versus Selective Search - 1 (11:56)

Complete versus Selective Search - 2 (12:23)

Other Trade-offs in UCT (14:08)

UCTMAXh versus Minimax (15:40)

Background: Trap States - 1 (16:13)

Background: Trap States - 2 (16:52)

Traps in Mancala (17:07)

‘Partial’ Games of Mancala (17:39)

UCTMAXh versus Minimax - 3 (18:05)

Conclusions (19:24)