
lil’ UCB: An Optimal Exploration Algorithm for Multi-Armed Bandits

Published on Jul 15, 2014
2151 Views

The paper proposes a novel upper confidence bound (UCB) procedure for identifying the arm with the largest mean in a multi-armed bandit game in the fixed confidence setting using a small number of tot
