Toward the understanding of partial-monitoring games

Published on 2011-07-25, 4513 views

Csaba Szepesvári

Partial monitoring games form a common ground for problems such as learning with expert advice, the multi-armed bandit problem, dynamic pricing, the dark pool problem, label efficient prediction, chan

On-line Trading of Exploration and Exploitation 2011 - Washington

Presentation

Toward the understanding of partial-monitoring games 00:00

Outline 01:04

Outline: Partial Monitoring (what and why?) (1) 01:30

Prediction Games 01:32

A Mathematical Framework 03:11

Measure of Performance of the Learner 04:06

Outline: Partial Monitoring (what and why?) (2) 06:06

What is Missing? 06:16

Partial-Monitoring Games 11:31

Mathematical Framework 12:40

Examples I 13:44

Examples II 13:53

Example IV: Multi-Armed Bandits 16:35

Example V: Dynamic Pricing 17:30

Previous Results about Finite Games (1) 17:40

Previous Results about Finite Games (2) 21:17

Our Contributions 22:49

Outline: Results (1) 24:18

What Information Can We Collect? 24:24

What is the Information Good For? 27:22

Outline: Results (2) 29:29

The Fundamental Lemma 29:31

Proof of the Fundamental Lemma 30:10

Proof 31:37

The Fundamental Lemma - Illustration 31:47

Cell decomposition 33:50

Outline: Results (3) 34:53

Lower Bound 34:56

Lower Bound - Illustration 35:37

Outline: Results (4) 35:51

Toward the Upper Bound 35:51

Constructing Good Estimates 37:38

When Should the Learner Stop Using an Action? 39:43

Finishing the Upper Bound 40:18

Algorithm – BALATON 40:47

The Main Result 41:40

Summary for Finite, Stochastic Games 41:44

Conclusions 42:23

For Further Reading 43:39