Tradeoffs in online learning under partial information feedback
Published on Jan 16, 2013
2828 Views
How should an online learner choose its actions to trade off between exploration and exploitation to maximize the accuracy of predictions where the choice of actions directly influences what informatio
Chapter list
Tradeoffs in online learning under partial information feedback (00:00)
Collaborators (00:04)
Contents (00:25)
Partial monitoring - an example (00:48)
The tradeoff - 1 (04:29)
Characterization of the tradeoff (05:17)
Theorem (12:11)
Back to dynamic pricing (14:15)
An adaptive strategy (16:59)
Adaptive control of the tradeoff! (18:43)
Prediction with side information (18:57)
The regret (20:31)
The algorithm CBP-Side (21:33)
Assumption: (23:51)
Result for logistic regression (27:30)
Open problems - 1 (27:47)
Online probing (28:53)
Regret (30:47)
The tradeoff - 2 (32:22)
Free labels! (32:39)
Finite competitor set F, Lipschitz losses (33:08)
Lipschitz losses: Covering arguments (36:43)
Adding structure: Linear regression with quadratic losses (37:35)
Regret for regression (40:53)
When the label is costly.. (43:27)
The tradeoff - 3 (45:06)
Open problems - 2 (45:52)
Distributed bandit optimization (46:12)
Simplified model (47:21)
The tradeoff - 4 (48:16)
Previous work (48:38)
Our results (49:20)
Method for the adversarial setting - 1 (49:58)
Method for the adversarial setting - 2 (51:11)
Method for the adversarial setting - 3 (51:54)
Open questions (52:23)
Conclusions (53:40)