Regret Lower Bound and Optimal Algorithm in Dueling Bandit Problem thumbnail
slide-image
Pause
Mute
Subtitles not available
Playback speed
0.25
0.5
0.75
1
1.25
1.5
1.75
2
Full screen

Regret Lower Bound and Optimal Algorithm in Dueling Bandit Problem

Published on Aug 20, 20151616 Views

We study the $K$-armed dueling bandit problem, a variation of the standard stochastic bandit problem where the feedback is limited to relative comparisons of a pair of arms. We introduce a tight asymp

Related categories

Chapter list

regret Lower Bound and Optimal Algorithm in Dueling Bandit Problem00:00
Motivation: perferencce elicitation00:08
Sushi - 100:19
Sushi - 200:25
Sushi - 300:31
Sushi - 400:38
Dueling bandit problem00:46
Condorcet assumption01:23
Main result 1: Regret lower bound01:57
Main result 2: The RMED algorithms02:27
Mai routine of RMD2FH03:04
Numerical Experiment03:40
Summary03:53