
Agents

rlai.core.Agent

Base class for all agents.

rlai.core.Human

An interactive, human-driven agent that prompts for actions at each time step.

rlai.core.MdpAgent

MDP agent. Adds the concepts of state, reward discounting, and policy-based action to the base agent.
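A minimal sketch of the discounting idea (hypothetical code, not the rlai API): future rewards are weighted by powers of a discount factor when evaluating a return.

```python
# Hypothetical illustration of reward discounting; not part of rlai itself.
def discounted_return(rewards, gamma=0.9):
    """Compute G_0 = sum_k gamma^k * r_k for a sequence of rewards."""
    return sum((gamma ** k) * r for k, r in enumerate(rewards))

print(discounted_return([1.0, 0.0, 2.0]))  # 1.0 + 0.9*0.0 + 0.81*2.0 = 2.62
```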

rlai.core.StochasticMdpAgent

Stochastic MDP agent. Adds random selection of actions based on probabilities specified in the agent's policy.
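A minimal sketch of stochastic action selection (hypothetical state, action, and policy names; not the rlai API): the policy assigns a probability to each action in a state, and the agent samples from that distribution rather than always taking the most probable action.

```python
# Hypothetical illustration of sampling an action from a stochastic policy.
import random

policy = {
    "s0": {"left": 0.25, "right": 0.75},  # hypothetical state/action names
}

def act(state):
    actions, probs = zip(*policy[state].items())
    return random.choices(actions, weights=probs, k=1)[0]

print(act("s0"))  # returns "right" roughly 75% of the time
```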

rlai.core.environments.robocode.RobocodeAgent

Robocode agent for the discrete-action Robocode environment.

rlai.core.environments.robocode_continuous_action.RobocodeAgent

Robocode agent for the continuous-action Robocode environment.

rlai.gpi.state_action_value.ActionValueMdpAgent

A stochastic MDP agent whose policy is based on action-value estimation. This agent is generally appropriate for discrete and continuous state spaces, in which action values are estimated with tabular and function-approximation methods, respectively. The action space needs to be discrete in all of these cases. If the action space is continuous, consider the `ParameterizedMdpAgent`.
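A minimal sketch of deriving a policy from estimated action values (hypothetical table and epsilon-greedy rule; not the rlai API): with a discrete action space, values can be stored per state-action pair and the agent usually acts greedily with occasional exploration.

```python
# Hypothetical illustration of epsilon-greedy selection over action values.
import random

q = {("s0", "left"): 0.1, ("s0", "right"): 0.4}  # hypothetical value estimates

def epsilon_greedy(state, actions, epsilon=0.1):
    if random.random() < epsilon:
        return random.choice(actions)                     # explore
    return max(actions, key=lambda a: q[(state, a)])      # exploit best estimate

print(epsilon_greedy("s0", ["left", "right"]))  # usually "right"
```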

rlai.policy_gradient.ParameterizedMdpAgent

A stochastic MDP agent whose policy is directly parameterized. This agent is generally appropriate when both the state and action spaces are continuous. If the action space is discrete, then consider the `ActionValueMdpAgent`.
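A minimal sketch of a directly parameterized policy (hypothetical Gaussian form and parameter values; not the rlai API): for a continuous action space, the policy can be a distribution, such as a Gaussian whose mean is a function of state features, and policy-gradient methods adjust its parameters directly.

```python
# Hypothetical illustration of a parameterized Gaussian policy over continuous actions.
import random

theta = [0.5, -0.2]  # hypothetical policy parameters
sigma = 0.1          # fixed exploration noise

def act(state_features):
    mean = sum(t * x for t, x in zip(theta, state_features))
    return random.gauss(mean, sigma)  # sample a continuous action

print(act([1.0, 2.0]))  # samples near 0.1
```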