Regret Lower Bound and Optimal Algorithm in Dueling Bandit Problem

Published on 2015-08-20, 1630 views

Junpei Komiyama

We study the $K$-armed dueling bandit problem, a variation of the standard stochastic bandit problem where the feedback is limited to relative comparisons of a pair of arms. We introduce a tight asymp

COLT 2015 - Paris

Presentation

Regret Lower Bound and Optimal Algorithm in Dueling Bandit Problem (00:00)

Motivation: preference elicitation (00:08)

Sushi - 1 (00:19)

Sushi - 2 (00:25)

Sushi - 3 (00:31)

Sushi - 4 (00:38)

Dueling bandit problem (00:46)

Condorcet assumption (01:23)

Main result 1: Regret lower bound (01:57)

Main result 2: The RMED algorithms (02:27)

Main routine of RMED2FH (03:04)

Numerical Experiment (03:40)

Summary (03:53)