en-de
en-es
en-fr
en-pt
en-sl
en
en-zh
0.25
0.5
0.75
1.25
1.5
1.75
2
A simple multi-armed bandit algorithm with optimal variation-bounded regret
Published on Aug 02, 20114671 Views
We pose the question of whether it is possible to design a simple, linear-time algorithm for the basic multi-armed bandit problem in the adversarial setting which has a regret bound of O(√QlogT), wh
Related categories
Chapter list
A simple MAB algorithm with optimal variation bounded regret00:00
Multi-Armed Bandits00:20
State Of the art01:14
Can we be more optimal?03:14
Known Variational bounds/bandits04:45
The Question06:14