Practical RL: Representation, interaction, synthesis, and mortality (PRISM)

Published on 2015-07-28 (2815 views)

Peter Stone

When scaling up Reinforcement Learning (RL) to large continuous domains with imperfect representations and hierarchical structure, we often try applying algorithms that are proven to converge in small

RLDM 2015 - Edmonton

Presentation

Practical RL: Representation, Interaction, Synthesis and Mortality (PRISM) (00:00)

RL as a Tool - 1 (00:42)

RL as a Tool - 2 (01:01)

RL as a Tool - 3 (01:18)

RL as a Tool - 4 (01:25)

RL as a Tool - 5 (01:31)

RL as a Tool - 6 (01:32)

Practical RL - 1 (01:42)

Practical RL - 3 (02:06)

Practical RL - 6 (04:09)

Practical RL - 7 (04:17)

Practical RL - 8 (04:26)

TEXPLORE: Real-Time Sample-Efficient Reinforcement Learning for Robots (04:29)

Reinforcement Learning - 1 (04:42)

Reinforcement Learning - 2 (05:01)

Mortality - 1 (05:45)

Mortality - 2 (06:50)

Mortality - 3 (06:53)

Fuel World (06:59)

Fuel World Behavior (07:39)

Velocity Control: Real-Time Need (08:28)

Velocity Control (08:56)

Desiderata - 1 (09:35)

Desiderata - 2 (09:57)

Common Approaches - 1 (10:02)

Common Approaches - 2 (10:06)

The TEXPLORE Algorithm (10:13)

Challenge 1: Sample Efficiency - 1 (10:22)

Challenge 1: Sample Efficiency - 2 (10:34)

Challenge 1: Sample Efficiency - 3 (10:39)

Challenge 1: Sample Efficiency - 4 (10:47)

Random Forest Model - 1 (10:51)

Random Forest Model - 2 (11:19)

Real-Time Model Based Architecture (RTMBA) - 1 (13:03)

Real-Time Model Based Architecture (RTMBA) - 2 (13:25)

Real-Time Model Based Architecture (RTMBA) - 3 (13:25)

Real-Time Model Based Architecture (RTMBA) - 4 (13:30)

Challenge 3: Continuous State - 1 (13:53)

Challenge 3: Continuous State - 2 (14:00)

Challenge 4: Actuator Delays - 1 (14:10)

Challenge 4: Actuator Delays - 2 (14:13)

Challenge 4: Actuator Delays - 3 (14:34)

Autonomous Vehicle (14:38)

Uses ROS [Quigley et al. 2009] (14:46)

Simulation Experiments - 1 (14:54)

Simulation Experiments - 2 (15:12)

Challenge 1: Sample Efficiency (15:14)

Challenge 2: Real-Time Action Selection (15:22)

Challenge 3: Modeling Continuous Domains (15:23)

Challenge 4: Handling Delayed Actions (15:23)

On the physical vehicle - 1 (15:24)

On the physical vehicle - 2 (15:30)

TEXPLORE Summary (16:05)

Practical RL (16:51)

UT Austin Villa 2014 (17:27)

UT Austin Villa 2014 - 1 (17:40)

UT Austin Villa 2014 - 2 (18:23)

UT Austin Villa 2014 - 3 (18:28)

UT Austin Villa 2014 - 4 (18:32)

Layered Learning in Practice - 1 (18:45)

Layered Learning in Practice - 2 (20:05)

Layered Learning Paradigms - 1 (21:05)

Layered Learning Paradigms - 2 (21:41)

Layered Learning Paradigms - 3 (22:00)

Overlapping Layered Learning - 1 (22:18)

Overlapping Layered Learning - 2 (22:22)

Overlapping Layered Learning - 3 (22:34)

Overlapping Layered Learning - 4 (22:42)

RoboCup 3D Simulation Domain (22:52)

RoboCup Champions 2011, 2012 - 1 (23:23)

RoboCup Champions 2011, 2012 - 2 (23:52)

RoboCup Champions 2011, 2012 - 3 (24:38)

RoboCup Champions 2011, 2012 - 4 (24:50)

RoboCup Champions 2011, 2012 - 5 (28:02)

Learned Layers (28:08)

Dribbling and Kicking the Ball in the Goal (29:01)

Scoring on a Kickoff (30:21)

Impact of Overlapping Layered Learning - 1 (31:36)

Impact of Overlapping Layered Learning - 2 (31:46)

Repetition on Different Robot Types - 1 (32:07)

Repetition on Different Robot Types - 2 (32:24)

Repetition on Different Robot Types - 3 (32:30)

RoboCup 2014 - 1 (33:03)

RoboCup 2014 - 2 (34:01)

RoboCup 2014 - 3 (34:02)

RoboCup 2014 - 4 (34:03)

Making Friends on the Fly: Advances in Ad Hoc Teamwork (34:19)

Ad Hoc Teamwork - 1 (34:23)

Ad Hoc Teamwork - 2 (34:35)

PLASTIC (34:37)

Testbed Domains (35:16)

Practical RL - 2 (35:50)

Practical RL - 4 (35:57)

Practical RL - 5 (36:15)