Sample-Based Learning and Search with Permanent and Transient Memories
Published on Aug 12, 2008
3491 Views
We present a reinforcement learning architecture, Dyna-2, that encompasses both sample-based learning and sample-based search, and that generalises across states during both learning and search. We ap