Using upper confidence bounds to control exploration and exploitation
Author: Csaba Szepesvari, Department of Computing Science, University of Alberta
| Time | Slide |
| 0:01 | Using upper confidence bounds to control exploration and exploitation |
| 0:09 | Contents |
| 1:15 | Exploration vs. Exploitation |
| 2:08 | Exploration vs. Exploitation: Some Applications |
| 3:06 | Bandit Problems – “Optimism in the Face of Uncertainty” |
| 5:32 | Parametric Bandits [Lai&Robbins] |
| 7:16 | Bounds |
| 8:26 | UCB1 Algorithm (Auer et al., 2002) |
| 10:41 | TITLE |
| 11:02 | Bandits in Continuous Time |
| 13:00 | Formal framework |
| 14:24 | Evaluating allocation rules (policies) |
| 16:19 | Gain, action values and regret |
| 19:11 | Model-based UCB |
| 22:07 | Algorithm |
| 24:52 | Regret bound |
| 28:37 | Key proposition |
| 29:40 | Open problems |
| 31:06 | Levente Kocsis, Remi Munos |
| 31:23 | Bandits with large action-spaces |
| 31:46 | Structure helps! |
| 32:51 | UCT: Upper Confidence based Tree search |
| 33:13 | Example (t=1) |
| 34:18 | Example (t=2) |
| 34:45 | Example (t=3) |
| 35:04 | Example (t=4) |
| 35:17 | What is the next time a suboptimal action is sampled? |
| 36:01 | UCT variations |
| 37:29 | UCT variations |
| 37:53 | Theoretical results |
| 39:16 | Planning in MDPs: Sailing |
| 39:24 | Planning in MDPs: Sailing |
| 39:32 | Planning in MDPs: Sailing |
| 40:22 | Results in games |
| 41:02 | Thank you! |




