Automatic Discovery and Transfer of MAXQ Hierarchies
Author: Neville Mehta, Oregon State University
Description
We present an algorithm, HI-MAT (Hierarchy Induction via Models And Trajectories), that discovers MAXQ task hierarchies by applying dynamic Bayesian network models to a successful trajectory from a source reinforcement learning task. HI-MAT discovers subtasks by analyzing the causal and temporal relationships among the actions in the trajectory. Under appropriate assumptions, HI-MAT induces hierarchies that are consistent with the observed trajectory and have compact value-function tables employing safe state abstractions. We demonstrate empirically that HI-MAT constructs compact hierarchies that are comparable to manually engineered hierarchies and that significantly speed up learning when transferred to a target task.
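To make "analyzing the causal and temporal relationships among the actions in the trajectory" more concrete, the sketch below annotates a toy trajectory with causal edges: action i is linked to a later action j through variable v when i changes v, j depends on v, and no action in between changes v. This is only an illustrative assumption about how such an annotation might be computed, not the authors' implementation. The names `ActionModel`, `causal_edges`, the action names, and the variables `taxi_loc` and `pass_loc` are all hypothetical, and the read/write sets stand in for what the DBN action models would supply.

```python
from typing import Dict, List, NamedTuple, Set, Tuple


class ActionModel(NamedTuple):
    """Hypothetical stand-in for a DBN action model."""
    reads: Set[str]   # variables the action depends on
    writes: Set[str]  # variables the action changes


def causal_edges(
    trajectory: List[str], models: Dict[str, ActionModel]
) -> List[Tuple[int, int, str]]:
    """Return (source_index, target_index, variable) edges.

    An edge (i, j, v) means step i was the most recent step to change v
    before step j, and step j depends on v.
    """
    last_writer: Dict[str, int] = {}
    edges: List[Tuple[int, int, str]] = []
    for j, name in enumerate(trajectory):
        model = models[name]
        # Link this step to the latest earlier step that set each variable it reads.
        for var in sorted(model.reads):
            if var in last_writer:
                edges.append((last_writer[var], j, var))
        # Record this step as the latest writer only after its reads are linked.
        for var in model.writes:
            last_writer[var] = j
    return edges


# Toy, assumed models loosely inspired by a taxi-style task.
models = {
    "goto_passenger": ActionModel({"taxi_loc"}, {"taxi_loc"}),
    "pickup": ActionModel({"taxi_loc", "pass_loc"}, {"pass_loc"}),
    "goto_dest": ActionModel({"taxi_loc"}, {"taxi_loc"}),
    "putdown": ActionModel({"taxi_loc", "pass_loc"}, {"pass_loc"}),
}
trajectory = ["goto_passenger", "pickup", "goto_dest", "putdown"]

edges = causal_edges(trajectory, models)
for src, dst, var in edges:
    print(f"{trajectory[src]} --{var}--> {trajectory[dst]}")
```

In this toy run, the final `putdown` step depends on `pass_loc` (last set by `pickup`) and on `taxi_loc` (last set by `goto_dest`), which hints at how groups of causally linked actions could be carved out as candidate subtasks.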
| Time | Slide |
| 0:00 | Automatic Discovery and Transfer of MAXQ Hierarchies |
| 0:07 | Motivation (1) |
| 2:06 | Motivation (2) |
| 2:46 | Our Approach: HI-MAT |
| 3:11 | Markov Decision Process |
| 3:49 | Dynamic Bayesian Network (DBN) (1) |
| 4:26 | Dynamic Bayesian Network (DBN) (2) |
| 5:18 | Hierarchical RL: MAXQ Framework |
| 6:30 | MAXQ: Execution Semantics (1) |
| 6:39 | MAXQ: Execution Semantics (2) |
| 6:45 | MAXQ: Execution Semantics (3) |
| 6:54 | MAXQ: Execution Semantics (4) |
| 7:07 | MAXQ: Execution Semantics (5) |
| 7:16 | MAXQ: Execution Semantics (6) |
| 7:22 | MAXQ: Execution Semantics (7) |
| 7:23 | MAXQ: Execution Semantics (8) |
| 7:26 | Hierarchy Learning Problem |
| 8:12 | Desired Properties |
| 9:28 | HI-MAT Algorithm (1) |
| 9:39 | HI-MAT Algorithm (2) |
| 9:47 | HI-MAT Algorithm (3) |
| 10:03 | HI-MAT Algorithm (4) |
| 10:47 | HI-MAT Algorithm (5) |
| 11:04 | HI-MAT Algorithm (6) |
| 11:26 | HI-MAT Algorithm (7) |
| 11:47 | HI-MAT Algorithm (8) |
| 11:57 | HI-MAT Algorithm (9) |
| 12:04 | HI-MAT Algorithm (10) |
| 12:15 | HI-MAT Algorithm (11) |
| 12:43 | HI-MAT Algorithm (12) |
| 13:05 | HI-MAT Algorithm (13) |
| 13:17 | HI-MAT Algorithm (14) |
| 13:29 | HI-MAT Algorithm (15) |
| 13:53 | HI-MAT Algorithm (16) |
| 14:09 | HI-MAT Algorithm (17) |
| 14:29 | Empirical Evaluation: Hypotheses |
| 15:33 | Experimental Setup: Taxi (1) |
| 16:04 | Experimental Setup: Taxi (2) |
| 16:49 | Results: Taxi (1) |
| 16:59 | Results: Taxi (2) |
| 17:29 | Results: Taxi (3) |
| 18:04 | Results: Taxi (4) |
| 18:19 | Results: Taxi (5) |
| 18:33 | Experimental Setup: Wargus (1) |
| 18:43 | Experimental Setup: Wargus (2) |
| 18:53 | Induced Wargus Hierarchy |
| 18:57 | Hand-Built Wargus Hierarchy |
| 19:10 | Induced Wargus Hierarchy |
| 19:13 | VISA’s Wargus Hierarchy |
| 19:17 | Results: Wargus (1) |
| 19:33 | Results: Wargus (2) |
| 20:13 | Contribution of the Trajectory |
| 21:01 | Modified Bitflip Domain |
| 21:33 | Modified Bitflip Domain: Example |
| 21:50 | VISA’s Causal Graph |
| 22:05 | Modified Bitflip CAT |
| 22:12 | Hierarchy Comparison |
| 22:31 | Results: 7-bit Modified Bitflip |
| 22:38 | Conclusion |