Online Reinforcement Learning from Concurrent Customer Interaction Sequences
published: May 28, 2013, recorded: September 2012, views: 4171
This talk explores applications in which a company interacts with many customers. The company has an objective function, such as maximising revenue, customer satisfaction, or customer loyalty, which depends primarily on the sequence of interactions between company and customer. A key aspect of this setting is that interactions with different customers occur asynchronously and in parallel. As a result, it is imperative to learn online from partial interaction sequences, so that information acquired from one customer is efficiently assimilated and applied in subsequent interactions with other customers. I will present the first framework for reinforcement learning in this setting, using an asynchronous variant of temporal-difference learning to learn efficiently from partial interaction sequences.
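To make the idea of learning from partial, interleaved sequences concrete, here is a minimal sketch. It is not the framework presented in the talk: it assumes plain tabular TD(0) with a single value table shared by all customers, and the names `SharedTD0` and `interleave`, the state labels, and the reward values are all hypothetical, chosen only for illustration.

```python
import random
from collections import defaultdict


class SharedTD0:
    """Tabular TD(0) value estimates shared by every customer session (illustrative only)."""

    def __init__(self, alpha=0.1, gamma=0.9):
        self.alpha = alpha                # step size
        self.gamma = gamma                # discount factor
        self.value = defaultdict(float)   # state -> estimated return, initialised to 0.0

    def update(self, state, reward, next_state, terminal=False):
        """Apply one TD(0) update from a single observed transition and return the TD error."""
        if terminal:
            target = reward
        else:
            target = reward + self.gamma * self.value[next_state]
        td_error = target - self.value[state]
        self.value[state] += self.alpha * td_error
        return td_error


def interleave(sessions, rng):
    """Yield (customer_id, transition) pairs in a random interleaving.

    The order of transitions within each customer's session is preserved,
    mimicking customers who interact asynchronously and in parallel.
    """
    cursors = {cid: 0 for cid in sessions}
    active = [cid for cid in sessions if sessions[cid]]
    while active:
        cid = rng.choice(active)
        yield cid, sessions[cid][cursors[cid]]
        cursors[cid] += 1
        if cursors[cid] == len(sessions[cid]):
            active.remove(cid)


# Hypothetical transition format: (state, reward, next_state, terminal).
learner = SharedTD0(alpha=0.5, gamma=1.0)

# Customer A completes a purchase from the "cart" state.
learner.update("cart", 1.0, None, terminal=True)   # value["cart"] becomes 0.5

# Customer B, still mid-sequence, moves from "browse" to "cart".
# The update bootstraps from what was just learned from customer A.
learner.update("browse", 0.0, "cart")              # value["browse"] becomes 0.25

# Replaying several sessions in an arbitrary interleaving, one transition at a time.
sessions = {
    "A": [("browse", 0.0, "cart", False), ("cart", 1.0, None, True)],
    "B": [("browse", 0.0, "exit", False), ("exit", 0.0, None, True)],
}
stream_learner = SharedTD0()
for cid, (s, r, s_next, done) in interleave(sessions, random.Random(0)):
    stream_learner.update(s, r, s_next, terminal=done)
```

Because every transition updates the shared table as soon as it is observed, no session has to finish before its information can influence the value estimates used for other customers. The talk's asynchronous variant may differ substantially from this simplification.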