A Finite-Time Analysis of Multi-armed Bandits Problems with Kullback-Leibler Divergences
Published on Aug 02, 2011
3364 Views
We consider a Kullback-Leibler-based algorithm for the stochastic multi-armed bandit problem in the case of distributions with finite supports (not necessarily known beforehand), whose asymptotic reg
Chapter list
A Finite-Time Analysis of Multi-armed Bandits Problems with Kullback-Leibler Divergences 00:00
Multi-Armed Bandit setting 00:09
The stochastic Multi-Armed Bandit setting 00:18
Measure of performance for a Multi-Armed Bandit 01:01
Historical overview (I): First step 01:45
Historical overview (II): Extension 02:49
Intuition: from K to Kinf 04:06
Historical overview (III): Non-asymptotic 05:26
Historical overview (III): General asymptotic 06:22
Overview 07:05
Distributions with finite support 07:32
Algorithm: the Kinf-strategy 07:32
Non-asymptotic upper-bound for the Kinf-strategy 08:16
One-slide summary 08:58
The explicit case of Bernoulli distributions (I) 09:50
The explicit case of Bernoulli distributions (II) 10:30
Concentration tools - 1 11:09
Concentration tools - 2 12:25
Intuition: Information complexity of sub-optimal arms - 1 13:32
Intuition: Information complexity of sub-optimal arms - 2 14:34
Intuition: Information complexity of sub-optimal arms - 3 14:42
Intuition: Information complexity of sub-optimal arms - 4 14:47
Intuition: Information complexity of sub-optimal arms - 5 15:03
Intuition: Information complexity of sub-optimal arms - 6 15:05
Intuition: Information complexity of sub-optimal arms - 7 15:19
Intuition: Information complexity of sub-optimal arms - 8 15:30
Intuition: Information complexity of sub-optimal arms - 9 16:02
Conclusion and Future work 16:16
Köszönöm! 17:37