A Finite-Time Analysis of Multi-armed Bandits Problems with Kullback-Leibler Divergences

Published on Aug 02, 2011. 3365 views.

We consider a Kullback-Leibler-based algorithm for the stochastic multi-armed bandit problem in the case of distributions with finite supports (not necessarily known beforehand), whose asymptotic reg

Chapter list

A Finite-Time Analysis of Multi-armed Bandits Problems with Kullback-Leibler Divergences (00:00)
Multi-Armed Bandit setting (00:09)
The stochastic Multi-Armed Bandit setting (00:18)
Measure of performance for a Multi-Armed Bandit (01:01)
Historical overview (I): First step (01:45)
Historical overview (II): Extension (02:49)
Intuition: from K to Kinf (04:06)
Historical overview (III): Non-asymptotic (05:26)
Historical overview (III): General asymptotic (06:22)
Overview (07:05)
Distributions with finite support (07:32)
Algorithm: the Kinf-strategy (07:32)
Non-asymptotic upper-bound for the Kinf-strategy (08:16)
One-slide summary (08:58)
The explicit case of Bernoulli distributions (I) (09:50)
The explicit case of Bernoulli distributions (II) (10:30)
Concentration tools - 1 (11:09)
Concentration tools - 2 (12:25)
Intuition: Information complexity of sub-optimal arms - 1 (13:32)
Intuition: Information complexity of sub-optimal arms - 2 (14:34)
Intuition: Information complexity of sub-optimal arms - 3 (14:42)
Intuition: Information complexity of sub-optimal arms - 4 (14:47)
Intuition: Information complexity of sub-optimal arms - 5 (15:03)
Intuition: Information complexity of sub-optimal arms - 6 (15:05)
Intuition: Information complexity of sub-optimal arms - 7 (15:19)
Intuition: Information complexity of sub-optimal arms - 8 (15:30)
Intuition: Information complexity of sub-optimal arms - 9 (16:02)
Conclusion and Future work (16:16)
Thank you! (17:37)