Counterfactual Risk Minimization: Learning from Logged Bandit Feedback

Published on 2015-12-05 | 1694 Views

Adith Swaminathan

We develop a learning principle and an efficient algorithm for batch learning from logged bandit feedback. This learning setting is ubiquitous in online systems (e.g., ad placement, web search, recomm

ICML 2015 - Lille
