Active Reinforcement Learning

Published on 2008-08-06
5242 Views

Arkady Epshteyn

When the transition probabilities and rewards of a Markov Decision Process (MDP) are known, the agent can obtain the optimal policy without any interaction with the environment. However, exact transit
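As an illustration of the first sentence of the abstract, here is a minimal sketch of how an optimal policy can be computed purely from a known model, with no interaction with the environment. It uses value iteration, a standard planning method that is not necessarily the one discussed in the talk. The discount factor `gamma`, the array layout (`P[a, s, s']`, `R[s, a]`), the function name, and the two-state toy MDP are all assumptions made for this example.

```python
import numpy as np


def value_iteration(P, R, gamma=0.9, tol=1e-8, max_iter=10_000):
    """Plan in a finite MDP whose model is fully known.

    P[a, s, t] is the probability of moving from state s to state t
    under action a; R[s, a] is the expected immediate reward.
    (Hypothetical layout chosen for this sketch.)
    """
    n_states = P.shape[1]
    V = np.zeros(n_states)
    for _ in range(max_iter):
        # Bellman optimality backup:
        # Q[s, a] = R[s, a] + gamma * sum_t P[a, s, t] * V[t]
        Q = R + gamma * np.einsum("ast,t->sa", P, V)
        V_new = Q.max(axis=1)
        converged = np.max(np.abs(V_new - V)) < tol
        V = V_new
        if converged:
            break
    # Recompute Q from the final values and act greedily.
    Q = R + gamma * np.einsum("ast,t->sa", P, V)
    return V, Q.argmax(axis=1)


# Toy MDP (assumed for illustration): action 0 stays put, action 1
# switches state; staying in state 1 pays a reward of 1.
P = np.array([
    [[1.0, 0.0], [0.0, 1.0]],  # action 0: stay
    [[0.0, 1.0], [1.0, 0.0]],  # action 1: switch
])
R = np.array([
    [0.0, 0.0],  # rewards in state 0 for actions 0 and 1
    [1.0, 0.0],  # rewards in state 1 for actions 0 and 1
])

V, policy = value_iteration(P, R, gamma=0.9)
print(V, policy)
```

Everything here depends on `P` and `R` being exact: the policy is computed entirely offline from the model.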

Reinforcement Learning