Chapter 8: Planning and Learning with Tabular Methods
rlai.core.environments.mdp.MdpPlanningEnvironment
An MDP planning environment that generates simulated experience from a model of the MDP. The model is learned
through direct experience with the actual environment.
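
The idea of planning from a learned model can be illustrated with a small, self-contained sketch. This is not the rlai implementation: `LastSeenModel`, `simulate`, and `plan` are hypothetical names, and the sketch assumes a deterministic environment and a tabular Q-learning update.

```python
import random
from typing import Dict, List, Tuple

Transition = Tuple[str, str, str, float]  # (state, action, next_state, reward)


class LastSeenModel:
    """Hypothetical model (not rlai's): remembers the latest outcome of each (state, action)."""

    def __init__(self) -> None:
        self.outcomes: Dict[Tuple[str, str], Tuple[str, float]] = {}

    def update(self, state: str, action: str, next_state: str, reward: float) -> None:
        # Called once per step of real experience with the actual environment.
        self.outcomes[(state, action)] = (next_state, reward)

    def simulate(self, rng: random.Random) -> Transition:
        # Replay a previously observed (state, action), chosen uniformly at random.
        state, action = rng.choice(list(self.outcomes))
        next_state, reward = self.outcomes[(state, action)]
        return state, action, next_state, reward


def plan(q: Dict[Tuple[str, str], float], model: LastSeenModel, actions: List[str],
         n_steps: int, rng: random.Random, alpha: float = 0.5, gamma: float = 0.9) -> None:
    """Apply Q-learning updates to simulated (not real) experience."""
    for _ in range(n_steps):
        s, a, s2, r = model.simulate(rng)
        # States with no recorded value (e.g. terminal states) default to 0.
        best_next = max(q.get((s2, b), 0.0) for b in actions)
        old = q.get((s, a), 0.0)
        q[(s, a)] = old + alpha * (r + gamma * best_next - old)


# Two real transitions are observed, then planning reuses them many times.
model = LastSeenModel()
model.update("A", "right", "B", 0.0)
model.update("B", "right", "goal", 1.0)
q: Dict[Tuple[str, str], float] = {}
plan(q, model, actions=["right"], n_steps=50, rng=random.Random(0))
```

In this sketch the reward observed at `B` propagates back to `A` purely through simulated updates, without any further interaction with the actual environment.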
rlai.core.environments.mdp.EnvironmentModel
An environment model.
rlai.core.environments.mdp.PrioritizedSweepingMdpPlanningEnvironment
State-action transitions are prioritized based on the degree to which learning updates their values, and transitions
with the highest priority are explored during planning.
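
The prioritization described above can be sketched with a priority queue. This is a minimal illustration under stated assumptions, not the rlai implementation: the model is a plain dictionary of deterministic transitions, the priority is the absolute size of the Q-learning update, and `td_error` and `prioritized_sweeping` are hypothetical names.

```python
import heapq
from typing import Dict, List, Tuple

StateAction = Tuple[str, str]
Model = Dict[StateAction, Tuple[str, float]]  # (state, action) -> (next_state, reward)


def td_error(q: Dict[StateAction, float], model: Model, actions: List[str],
             s: str, a: str, gamma: float) -> float:
    """Size and sign of the update that (s, a) would receive right now."""
    s2, r = model[(s, a)]
    best_next = max(q.get((s2, b), 0.0) for b in actions)
    return r + gamma * best_next - q.get((s, a), 0.0)


def prioritized_sweeping(model: Model, actions: List[str], n_updates: int,
                         gamma: float = 0.9, theta: float = 1e-6):
    q: Dict[StateAction, float] = {}
    # Index the model by successor state so predecessors can be found quickly.
    predecessors: Dict[str, List[StateAction]] = {}
    for (s, a), (s2, _) in model.items():
        predecessors.setdefault(s2, []).append((s, a))

    # heapq is a min-heap, so priorities are negated to pop the largest first.
    queue: List[Tuple[float, str, str]] = []
    for (s, a) in model:
        priority = abs(td_error(q, model, actions, s, a, gamma))
        if priority > theta:
            heapq.heappush(queue, (-priority, s, a))

    order: List[StateAction] = []
    for _ in range(n_updates):
        if not queue:
            break
        _, s, a = heapq.heappop(queue)
        error = td_error(q, model, actions, s, a, gamma)
        if abs(error) <= theta:
            continue  # stale queue entry: the value has already been updated
        # Full-step update (step size 1), reasonable for a deterministic model.
        q[(s, a)] = q.get((s, a), 0.0) + error
        order.append((s, a))
        # The change to s may alter the priority of transitions leading into s.
        for (ps, pa) in predecessors.get(s, []):
            priority = abs(td_error(q, model, actions, ps, pa, gamma))
            if priority > theta:
                heapq.heappush(queue, (-priority, ps, pa))
    return q, order


chain: Model = {
    ("A", "right"): ("B", 0.0),
    ("B", "right"): ("C", 0.0),
    ("C", "right"): ("goal", 1.0),
}
q, order = prioritized_sweeping(chain, actions=["right"], n_updates=10)
```

On this three-state chain only the transition into `goal` has a nonzero priority at first, so updates sweep backward from `C` to `B` to `A`, and no update is spent on a transition whose value would not change.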
rlai.core.environments.mdp.StochasticEnvironmentModel
A stochastic environment model.
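
One common way to build a stochastic model is to count the outcomes observed for each state-action pair and sample from those counts. The sketch below assumes that approach; `CountBasedModel` and its methods are hypothetical names and are not taken from rlai.

```python
import random
from collections import Counter
from typing import Dict, Tuple

StateAction = Tuple[str, str]
Outcome = Tuple[str, float]  # (next_state, reward)


class CountBasedModel:
    """Hypothetical stochastic model: empirical outcome distribution per (state, action)."""

    def __init__(self) -> None:
        self.counts: Dict[StateAction, Counter] = {}

    def update(self, state: str, action: str, next_state: str, reward: float) -> None:
        # Record one real observation of taking `action` in `state`.
        self.counts.setdefault((state, action), Counter())[(next_state, reward)] += 1

    def probabilities(self, state: str, action: str) -> Dict[Outcome, float]:
        # Relative frequencies of the observed outcomes (empty if never observed).
        outcomes = self.counts.get((state, action), Counter())
        total = sum(outcomes.values())
        return {outcome: count / total for outcome, count in outcomes.items()}

    def sample(self, state: str, action: str, rng: random.Random) -> Outcome:
        # Draw an outcome with probability proportional to its observed count.
        if (state, action) not in self.counts:
            raise KeyError(f"no experience recorded for {(state, action)}")
        outcomes = self.counts[(state, action)]
        return rng.choices(list(outcomes), weights=list(outcomes.values()), k=1)[0]


model = CountBasedModel()
for _ in range(3):
    model.update("A", "right", "B", 0.0)   # usually moves to B
model.update("A", "right", "A", -1.0)      # occasionally slips and stays in A
print(model.probabilities("A", "right"))
print(model.sample("A", "right", random.Random(0)))
```

Unlike a model that stores only the most recent outcome, this one preserves the relative frequency of every outcome seen, so simulated experience reflects the observed randomness of the environment.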
rlai.core.environments.mdp.TrajectorySamplingMdpPlanningEnvironment
State-action transitions are selected by the agent based on the agent's policy, and the selected transitions are
explored during planning.
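
Trajectory sampling can be sketched as follows: starting from a given state, the agent's policy picks each action and the model supplies the outcome, so planning updates fall only on transitions the policy actually visits. This is an illustration under assumptions, not the rlai implementation: the model is a dictionary of deterministic transitions, the policy is epsilon-greedy, and `sample_trajectory` and `plan_along_trajectory` are hypothetical names.

```python
import random
from typing import Dict, List, Tuple

StateAction = Tuple[str, str]
Model = Dict[StateAction, Tuple[str, float]]  # (state, action) -> (next_state, reward)
Transition = Tuple[str, str, str, float]


def sample_trajectory(model: Model, q: Dict[StateAction, float], actions: List[str],
                      start: str, rng: random.Random, epsilon: float = 0.1,
                      max_steps: int = 20) -> List[Transition]:
    """Follow an epsilon-greedy policy through the model, starting at `start`."""
    trajectory: List[Transition] = []
    state = start
    for _ in range(max_steps):
        available = [a for a in actions if (state, a) in model]
        if not available:
            break  # terminal state, or a state the model knows nothing about
        if rng.random() < epsilon:
            action = rng.choice(available)
        else:
            action = max(available, key=lambda a: q.get((state, a), 0.0))
        next_state, reward = model[(state, action)]
        trajectory.append((state, action, next_state, reward))
        state = next_state
    return trajectory


def plan_along_trajectory(model: Model, q: Dict[StateAction, float], actions: List[str],
                          start: str, rng: random.Random, alpha: float = 0.5,
                          gamma: float = 0.9, epsilon: float = 0.1) -> List[Transition]:
    """Sample one trajectory, then apply Q-learning updates to its transitions."""
    trajectory = sample_trajectory(model, q, actions, start, rng, epsilon)
    for s, a, s2, r in trajectory:
        best_next = max(q.get((s2, b), 0.0) for b in actions)
        old = q.get((s, a), 0.0)
        q[(s, a)] = old + alpha * (r + gamma * best_next - old)
    return trajectory


model: Model = {
    ("A", "left"): ("A", 0.0),
    ("A", "right"): ("B", 0.0),
    ("B", "right"): ("goal", 1.0),
    ("C", "right"): ("goal", 5.0),  # never reached from A under the policy
}
q: Dict[StateAction, float] = {("A", "right"): 0.1}
visited = plan_along_trajectory(model, q, ["left", "right"], "A",
                                random.Random(0), epsilon=0.0)
```

With `epsilon=0.0` the policy is purely greedy, so the trajectory runs from `A` through `B` to `goal`, and the transition out of `C` receives no update even though the model contains it. For simplicity the sketch samples the whole trajectory before updating; interleaving selection and updates is an equally plausible design.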