Chapter 8: Planning and Learning with Tabular Methods

rlai.core.environments.mdp.MdpPlanningEnvironment

An MDP planning environment that generates simulated experience from a model of the MDP. The model is learned through direct experience with the actual environment.
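
To make the idea concrete, here is a minimal, self-contained sketch of planning from a learned model, in the style of Dyna-Q. It does not use the rlai API: `TabularModel`, `q_update`, and every parameter name are hypothetical, and the model is assumed to be deterministic for simplicity.

```python
import random
from collections import defaultdict


class TabularModel:
    """Hypothetical deterministic model: stores the last observed outcome of each (state, action)."""

    def __init__(self):
        self.transitions = {}

    def update(self, state, action, reward, next_state):
        # Learn from one step of real experience.
        self.transitions[(state, action)] = (reward, next_state)

    def sample_experience(self, rng):
        # Generate one step of simulated experience from a previously observed pair.
        (state, action), (reward, next_state) = rng.choice(list(self.transitions.items()))
        return state, action, reward, next_state


def q_update(q, actions, state, action, reward, next_state, alpha=0.5, gamma=0.9):
    # One-step tabular Q-learning update.
    best_next = max(q[(next_state, a)] for a in actions)
    q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])


rng = random.Random(0)
actions = ["left", "right"]
q = defaultdict(float)
model = TabularModel()

# Real experience: moving right from state 0 reaches state 1 with reward 1.
model.update(0, "right", 1.0, 1)
q_update(q, actions, 0, "right", 1.0, 1)

# Planning: apply the same update to experience simulated by the learned model.
for _ in range(10):
    q_update(q, actions, *model.sample_experience(rng))
```

Only one real step is taken, yet the value estimate keeps improving during planning because the model replays that transition.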

rlai.planning.environment_models.EnvironmentModel

An environment model.

rlai.core.environments.mdp.PrioritizedSweepingMdpPlanningEnvironment

State-action transitions are prioritized by how much learning changes their values, and the highest-priority transitions are explored during planning.
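
The prioritization described above can be sketched with a priority queue. This is an illustrative sketch under stated assumptions, not the rlai implementation: the function name, the deterministic `model` dictionary, and the `theta` threshold are all hypothetical, and duplicate queue entries are tolerated rather than merged.

```python
import heapq
from collections import defaultdict


def prioritized_sweeping(model, q, actions, start, n_updates, alpha=1.0, gamma=0.9, theta=1e-4):
    """model maps (state, action) -> (reward, next_state); start is the (state, action) pair that seeds the queue."""
    # Index which (state, action) pairs lead into each state.
    predecessors = defaultdict(set)
    for (s, a), (_, s_next) in model.items():
        predecessors[s_next].add((s, a))

    def td_error(s, a):
        r, s_next = model[(s, a)]
        return r + gamma * max(q[(s_next, b)] for b in actions) - q[(s, a)]

    queue = []  # min-heap, so priorities are stored negated
    priority = abs(td_error(*start))
    if priority > theta:
        heapq.heappush(queue, (-priority, start))

    updates = 0
    while queue and updates < n_updates:
        _, (s, a) = heapq.heappop(queue)
        q[(s, a)] += alpha * td_error(s, a)
        updates += 1
        # A change in the value of s may change the values of pairs that lead to s.
        for ps, pa in predecessors[s]:
            priority = abs(td_error(ps, pa))
            if priority > theta:
                heapq.heappush(queue, (-priority, (ps, pa)))
    return updates


# A two-step chain: 0 -> 1 -> 2, with reward 1 on the final step.
model = {(0, "go"): (0.0, 1), (1, "go"): (1.0, 2)}
q = defaultdict(float)
n = prioritized_sweeping(model, q, ["go"], start=(1, "go"), n_updates=10)
```

In this chain, the rewarding transition is updated first and its predecessor is queued only afterwards, so the value propagates backward in two updates instead of being found by random replay.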

rlai.planning.environment_models.StochasticEnvironmentModel

A stochastic environment model.
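
One common way to build a stochastic model is to count observed outcomes and sample in proportion to those counts. The sketch below assumes that approach; `CountingStochasticModel` and its methods are hypothetical names and do not describe the rlai class.

```python
import random
from collections import Counter, defaultdict


class CountingStochasticModel:
    """Hypothetical model that estimates p(next_state, reward | state, action) from observed counts."""

    def __init__(self):
        self.outcome_counts = defaultdict(Counter)

    def update(self, state, action, next_state, reward):
        self.outcome_counts[(state, action)][(next_state, reward)] += 1

    def probability(self, state, action, next_state, reward):
        counts = self.outcome_counts[(state, action)]
        total = sum(counts.values())
        return counts[(next_state, reward)] / total if total else 0.0

    def sample(self, state, action, rng):
        counts = self.outcome_counts[(state, action)]
        if not counts:
            raise KeyError(f"No experience recorded for {(state, action)!r}")
        outcomes = list(counts)
        # Draw one outcome with probability proportional to its observed count.
        return rng.choices(outcomes, weights=[counts[o] for o in outcomes], k=1)[0]


rng = random.Random(0)
model = CountingStochasticModel()
for _ in range(3):
    model.update(0, "a", 1, 0.0)
model.update(0, "a", 2, 1.0)

p = model.probability(0, "a", 1, 0.0)  # 3 of the 4 observations
outcome = model.sample(0, "a", rng)
```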

rlai.core.environments.mdp.TrajectorySamplingMdpPlanningEnvironment

State-action transitions are selected by the agent according to its policy, and the selected transitions are explored during planning.
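
As a rough illustration of trajectory sampling, the sketch below follows an epsilon-greedy policy through a learned model and updates values only along the simulated trajectory. The function names, the deterministic `model` dictionary, and the choice of an epsilon-greedy policy are assumptions made for this example, not the rlai API.

```python
import random
from collections import defaultdict


def epsilon_greedy(q, state, actions, epsilon, rng):
    if rng.random() < epsilon:
        return rng.choice(actions)
    return max(actions, key=lambda a: q[(state, a)])


def plan_along_trajectory(model, q, actions, start_state, max_steps, rng,
                          epsilon=0.1, alpha=0.5, gamma=0.9):
    """model maps (state, action) -> (reward, next_state). Returns the pairs visited."""
    state = start_state
    visited = []
    for _ in range(max_steps):
        # The policy, not a uniform sweep, decides which transition is simulated next.
        action = epsilon_greedy(q, state, actions, epsilon, rng)
        if (state, action) not in model:
            break  # the model has no experience for this pair
        reward, next_state = model[(state, action)]
        best_next = max(q[(next_state, a)] for a in actions)
        q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
        visited.append((state, action))
        state = next_state
    return visited


rng = random.Random(0)
model = {(0, "go"): (0.0, 1), (1, "go"): (1.0, 2)}
q = defaultdict(float)
visited = plan_along_trajectory(model, q, ["go"], start_state=0, max_steps=10, rng=rng)
```

Because updates follow the policy's own trajectories, planning effort concentrates on the state-action pairs the agent is likely to encounter.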