Thompson Sampling for Learning Parameterized Markov Decision Processes
Published on Aug 20, 2015
1726 Views
We consider reinforcement learning in parameterized Markov Decision Processes (MDPs), where the parameterization may induce correlation across transition probabilities or rewards. Consequently, obser
Chapter list
Thompson Sampling for Learning Parameterized Markov Decision Processes 00:00
Online Reinforcement Learning - 1 00:02
Online Reinforcement Learning - 2 00:15
Online Reinforcement Learning - 3 00:19
Online Reinforcement Learning - 4 00:28
Online Reinforcement Learning - 5 00:34
Online Reinforcement Learning - 6 00:38
Online Reinforcement Learning - 7 00:42
Online Reinforcement Learning - 8 00:44
Online Reinforcement Learning - 9 00:45
Online Reinforcement Learning - 10 00:47
Online Reinforcement Learning - 11 00:48
Online Reinforcement Learning - 12 00:49
Online Reinforcement Learning - 13 00:50
Online Reinforcement Learning - 14 00:51
Online Reinforcement Learning - 15 00:51
Online Reinforcement Learning - 16 00:53
Online Reinforcement Learning - 17 00:54
Online Reinforcement Learning - 18 00:54
Online Reinforcement Learning - 19 00:55
Online Reinforcement Learning - 20 00:56
Thompson Sampling [Thompson 1933] - 1 01:36
Thompson Sampling - 1 01:57
Thompson Sampling - 2 02:01
Thompson Sampling - 3 02:16
Main Result - 1 02:27
Main Result - 2 02:53