Models for Trading Exploration and Exploitation using Upper Confidence Bounds
Author: Peter Auer, University of Leoben
| Time | Slide |
| 0:00 | Models for Trading |
| 0:39 | Overview |
| 1:25 | The bandit problem with linear side information |
| 3:08 | Goal |
| 4:42 | Relation to other models |
| 6:54 | Results |
| 9:16 | Remarks |
| 9:53 | Algorithm: Using upper confidence bounds |
| 11:28 | Why does this work? |
| 12:58 | Algorithm: Using upper confidence bounds |
| 13:12 | Why does this work? |
| 13:35 | Algorithm: Using upper confidence bounds |
| 15:40 | Why does this work? |
| 16:05 | Bounding the widths of the confidence intervals |
| 17:55 | Random bandit problem (cont.) |
| 20:30 | Random bandit problem (cont.) |
| 22:13 | Random bandit problem: Improved bounds |
| 23:08 | Random bandit problem: Improved bounds |
| 23:20 | Random bandit problem: Improved bounds |
| 23:41 | Random bandit problem: Improved bounds |
| 23:47 | Random bandit problem: Improved bounds |
| 24:10 | Random bandit problem: Improved bounds |
| 24:29 | Random bandit problem: Improved bounds |
| 26:20 | Random bandit problem (cont.) |
| 27:26 | Linear side information |
| 31:25 | Calculating the variance |
| 32:21 | Optimizing the regret |
| 33:20 | Linear side information |
| 33:58 | Optimizing the regret |
| 34:20 | Calculating |
| 35:50 | Optimizing the regret |
| 35:58 | Calculating |
| 37:41 | Reinforcement learning |
| 37:55 | Calculating |
| 40:09 | Reinforcement learning |
| 43:05 | Motivation for online reinforcement learning |
| 44:31 | Discounted and undiscounted returns |
| 46:48 | Regret |
| 47:41 | Episodic reinforcement learning |
| 49:18 | The algorithm: Upper confidence bounds again |
| 53:17 | Why it works (1) |
| 55:07 | Why it works (2) |
| 56:04 | Why it works (3) |
| 59:46 | Conclusion |
| 62:28 | auer-slides_Page_34 |