
Markov Decision Processes with Ordinal Rewards: Reference Point-Based Preferences

Published on Jul 21, 2011 · 3585 views

In a standard Markov decision process (MDP), rewards are assumed to be precisely known and quantitative in nature. This can be too strong a hypothesis in some situations. When rewards can really be…
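To make the standard setting the talk contrasts with concrete, here is a minimal value-iteration sketch for a finite MDP with precisely known numeric rewards. The transition tensor `P`, reward matrix `R`, and discount `gamma` are illustrative assumptions, not taken from the lecture:

```python
import numpy as np

# Hypothetical 2-state, 2-action MDP (illustrative only).
P = np.array([              # P[a, s, s']: transition probabilities
    [[0.9, 0.1], [0.2, 0.8]],   # action 0
    [[0.5, 0.5], [0.7, 0.3]],   # action 1
])
R = np.array([              # R[a, s]: numeric reward, assumed precisely known
    [1.0, 0.0],
    [0.5, 2.0],
])
gamma = 0.9

def value_iteration(P, R, gamma, tol=1e-8):
    """Return the optimal value function and a greedy policy."""
    V = np.zeros(P.shape[1])
    while True:
        # Bellman optimality backup:
        # Q[a, s] = R[a, s] + gamma * sum_{s'} P[a, s, s'] * V[s']
        Q = R + gamma * (P @ V)
        V_new = Q.max(axis=0)
        if np.abs(V_new - V).max() < tol:
            return V_new, Q.argmax(axis=0)
        V = V_new

V_star, policy = value_iteration(P, R, gamma)
```

This whole computation hinges on the rewards being numeric and comparable on a common scale; with only ordinal rewards, the expectation in the Bellman backup is no longer meaningful, which is the gap the talk addresses.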

Chapter list

00:00  Markov Decision Processes with Ordinal Rewards: Reference Point-Based Preferences
00:55  Sequential Decision Making under Uncertainty (1)
01:37  Sequential Decision Making under Uncertainty (2)
02:18  Value Functions and Solution Methods (1)
02:23  Value Functions and Solution Methods (2)
02:25  Value Functions and Solution Methods (3)
02:45  Value Functions and Solution Methods (4)
02:52  Optimal Policies Depend on the Reward Function... (1)
03:46  Optimal Policies Depend on the Reward Function... (2)
04:04  Optimal Policies Depend on the Reward Function... (3)
04:36  Optimal Policies Depend on the Reward Function... (4)
05:11  Difficulty of Defining the Reward Function (1)
05:36  Difficulty of Defining the Reward Function (2)
06:53  Difficulty of Defining the Reward Function (3)
07:38  Towards Preferences over Vectors (1)
08:16  Towards Preferences over Vectors (2)
08:26  Towards Preferences over Vectors (3)
08:47  Assumptions for a Numeric Reward Function (1)
09:41  Assumptions for a Numeric Reward Function (2)
10:13  Assumptions for a Numeric Reward Function (3)
10:54  Assumptions for a Numeric Reward Function (4)
12:06  Assumptions for Reference Point-Based Preferences (1)
12:31  Assumptions for Reference Point-Based Preferences (2)
12:51  Assumptions for Reference Point-Based Preferences (3)
13:37  Interpretation (1)
13:40  Interpretation (2)
15:15  Interpretation (3)
15:32  Interpretation (4)
16:11  Vade Mecum
17:33  Reference Point-Based Preferences in Standard MDPs: One-Shot Decision (1)
18:05  Reference Point-Based Preferences in Standard MDPs: One-Shot Decision (2)
18:14  Reference Point-Based Preferences in Standard MDPs: One-Shot Decision (3)
18:58  Reference Point-Based Preferences in Standard MDPs: One-Shot Decision (4)
19:01  Reference Point-Based Preferences in Standard MDPs: One-Shot Decision (5)
19:02  Reference Point-Based Preferences in Standard MDPs: One-Shot Decision (6)
19:02  Reference Point-Based Preferences in Standard MDPs: One-Shot Decision (7)
19:02  Reference Point-Based Preferences in Standard MDPs: One-Shot Decision (8)
19:03  Conclusion and Future Work