Resourceful Contextual Bandits
published: July 15, 2014, recorded: June 2014, views: 2403
Report a problem or upload filesIf you have found a problem with this lecture or would like to send us extra material, articles, exercises, etc., please use our ticket system to describe your request and upload the data.
Enter your e-mail into the 'Cc' field, and we will keep you updated with your request's status.
We study contextual bandits with ancillary constraints on resources, which are common in real-world applications such as choosing ads or dynamic pricing of items. We design the first algorithm for solving these problems that improves over a trivial reduction to the non-contextual case. We consider very general settings for both contextual bandits (arbitrary policy sets, Dudik et al. (2011)) and bandits with resource constraints (bandits with knapsacks, Badanidiyuru et al. (2013a)), and prove a regret guarantee with near-optimal statistical properties.
Link this pageWould you like to put a link to this lecture on your homepage?
Go ahead! Copy the HTML snippet !