Probabilistic Decision-Making Under Model Uncertainty
Description
Partially Observable Markov Decision Processes (POMDPs) offer a rich mathematical framework for decision-making under uncertainty. In recent years, a number of methods have been developed to optimize the choice of action, given a parametric model of the domain. In many applications, however, this model must be learned from a finite set of trajectories. When this data is difficult or expensive to collect, the resulting model is often poorly or imprecisely defined. In this talk, I will present two recent results on the topic of decision-making under model uncertainty. In the first half, I will describe a method for estimating the bias and variance of the value function in terms of the statistics of the empirical transition and observation models. Such error terms can be used to meaningfully compare the value of different policies. In the second half, I will present a Bayesian approach designed to simultaneously improve the model and select good actions. Performance of the two methods will be illustrated using problems drawn from the fields of robotics and medical treatment design.
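To make the idea of value-function uncertainty concrete, here is a minimal sketch that is not the method presented in the talk. It assumes a small, fully observable Markov chain under a fixed policy, and it swaps the analytic bias and variance estimates for a simple Monte Carlo stand-in: transition matrices are sampled from a Dirichlet posterior over the observed transition counts, and the spread of the resulting values is measured. The function names, the uniform Dirichlet prior, and the example counts are all illustrative assumptions.

```python
import numpy as np


def policy_value(P, r, gamma):
    """Solve V = r + gamma * P @ V for a fixed policy with transition matrix P."""
    n = P.shape[0]
    return np.linalg.solve(np.eye(n) - gamma * P, r)


def value_uncertainty(counts, r, gamma=0.95, n_samples=2000, seed=0):
    """Monte Carlo estimate of how model uncertainty propagates to the value.

    counts[s, s2] is the number of observed transitions s -> s2 under the
    policy (hypothetical data). Returns the value under the empirical model,
    plus the mean and variance of the value over sampled models.
    """
    rng = np.random.default_rng(seed)
    counts = np.asarray(counts, dtype=float)
    r = np.asarray(r, dtype=float)

    # Empirical (maximum-likelihood) transition model and its value.
    P_hat = counts / counts.sum(axis=1, keepdims=True)
    v_hat = policy_value(P_hat, r, gamma)

    # Assumption: independent Dirichlet posterior per row, uniform prior (+1).
    samples = np.empty((n_samples, counts.shape[0]))
    for i in range(n_samples):
        P = np.vstack([rng.dirichlet(row + 1.0) for row in counts])
        samples[i] = policy_value(P, r, gamma)

    return v_hat, samples.mean(axis=0), samples.var(axis=0)


# Same transition frequencies, but 100 times more data in the second case.
small = [[3, 1], [1, 3]]
large = [[300, 100], [100, 300]]
rewards = [1.0, 0.0]

v_small, mean_small, var_small = value_uncertainty(small, rewards)
v_large, mean_large, var_large = value_uncertainty(large, rewards)
```

Both data sets give the same empirical value estimate, but the variance computed from the small data set is much larger. That gap is the kind of error term that matters when comparing two policies whose estimated values are close.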
| Slides | |
| 0:00 | - Announcement |
| 0:45 | Probabilistic Decision-Making Under Model Uncertainty |
| 1:39 | Motivation : A human-robot interaction problem |
| 4:07 | Typical ways of solving such problems part1 |
| 4:29 | Typical ways of solving such problems part2 |
| 4:44 | Typical ways of solving such problems part3 |
| 5:27 | Motivation : A treatment design problem |
| 8:15 | What do we need for tackling real-world problems ? |
| 8:52 | Partially Observable Markov Decision Processes part1 |
| 10:52 | Partially Observable Markov Decision Processes part2 |
| 11:04 | Partially Observable Markov Decision Processes part3 |
| 11:13 | Partially Observable Markov Decision Processes part4 |
| 12:40 | Motivation part1 |
| 13:06 | Motivation part2 |
| 13:58 | Motivation part3 |
| 14:26 | Let’s start with a simple case |
| 15:55 | Robot-Human Interaction Example |
| 16:58 | Finite State Controller part1 |
| 18:21 | Finite State Controller part2 |
| 19:17 | Estimating the Variance in the Value Function |
| 19:48 | Model Error |
| 21:15 | Variance in Value Function |
| 22:05 | Error in Value Function Estimate |
| 22:28 | Dialogue Manager part1 |
| 23:33 | Dialogue Manager part2 |
| 23:59 | Dialogue Manager part3 |
| 25:38 | Comparing treatment strategies for chronic illness |
| 29:21 | Discussion |
| 32:57 | Part 2 |
| 33:01 | Bayesian Reinforcement Learning part1 |
| 33:44 | Bayesian Reinforcement Learning part2 |
| 34:52 | Recall the POMDP model definition |
| 35:21 | Bayesian RL in Finite MDPs |
| 37:16 | Bayesian RL in Finite POMDPs |
| 37:45 | Bayes-Adaptive POMDP |
| 38:35 | A few comments |
| 39:23 | Question |
| 39:59 | Belief in BAPOMDPs |
| 41:08 | Theoretical results part1 |
| 42:42 | Theoretical results part2 |
| 43:05 | Finite POMDP Approximation part1 |
| 44:24 | Finite POMDP Approximation part2 |
| 44:35 | Approximate Belief Monitoring part1 |
| 45:01 | Approximate Belief Monitoring part2 |
| 45:45 | Approximate Belief Monitoring part3 |
| 46:02 | Approximate Belief Monitoring part4 |
| 47:47 | Approximation Planning in BAPOMDPs |
| 48:39 | Experimental Results part1 |
| 49:44 | Experimental Results part2 |
| 51:04 | Experimental Results part3 |
| 51:45 | Experimental Results part4 |
| 52:07 | Summary |
| 52:59 | Recent work |
| 54:08 | Conclusion |
| 54:35 | Acknowledgments |
| 54:48 | - Questions |
| 55:27 | - Questions |