ray.rllib.callbacks.callbacks.RLlibCallback.on_episode_created
- RLlibCallback.on_episode_created(*, episode: SingleAgentEpisode | MultiAgentEpisode | EpisodeV2, worker: EnvRunner | None = None, env_runner: EnvRunner | None = None, metrics_logger: MetricsLogger | None = None, base_env: BaseEnv | None = None, env: gymnasium.Env | None = None, policies: Dict[str, Policy] | None = None, rl_module: RLModule | None = None, env_index: int, **kwargs) -> None
- Callback run when a new episode is created (but has not started yet!).
- This method gets called after a new Episode(V2) (old stack) or MultiAgentEpisode instance has been created. This happens before the respective sub-environment's (usually a gym.Env) reset() is called by RLlib.
- Note that, at the moment, this callback does not get called in the new API stack in single-agent mode.
- The resulting order of events is:
  - Episode(V2)/MultiAgentEpisode created: This callback is called.
  - The respective sub-environment (gym.Env) is reset via reset().
  - Callback on_episode_start is called.
  - Stepping through sub-environment/episode commences.
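The ordering described above can be illustrated with a small, Ray-free simulation. This is only a sketch: ToyEnv, OrderRecorder, and run_one_episode are hypothetical stand-ins invented for this example and are not part of RLlib. The driver function mimics the documented sequence; it does not reproduce how an EnvRunner is actually implemented.

```python
class ToyEnv:
    """Hypothetical one-step environment that logs its own calls."""

    def __init__(self, log):
        self.log = log

    def reset(self):
        self.log.append("env.reset")
        return 0  # dummy observation

    def step(self, action):
        self.log.append("env.step")
        return 0, 0.0, True  # observation, reward, done


class OrderRecorder:
    """Hypothetical callback stand-in that records when each hook fires."""

    def __init__(self, log):
        self.log = log

    def on_episode_created(self, *, episode, env_index, **kwargs):
        self.log.append("on_episode_created")

    def on_episode_start(self, *, episode, env_index, **kwargs):
        self.log.append("on_episode_start")


def run_one_episode(env, callback, env_index=0):
    """Mimic the documented order: create, reset, start, then step."""
    episode = {"length": 0}  # placeholder for an episode object
    callback.on_episode_created(episode=episode, env_index=env_index)
    env.reset()
    callback.on_episode_start(episode=episode, env_index=env_index)
    done = False
    while not done:
        _, _, done = env.step(action=0)
        episode["length"] += 1
    return episode


call_log = []
finished = run_one_episode(ToyEnv(call_log), OrderRecorder(call_log))
print(call_log)
```

The printed log shows on_episode_created firing before the environment is reset, which is why no observations are available inside that hook.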
 - Parameters:
- episode – The newly created episode. On the new API stack, this will be a MultiAgentEpisode object. On the old API stack, this will be an Episode or EpisodeV2 object. This is the episode that is about to be started with an upcoming env.reset(). Only after this reset call will the on_episode_start callback be called.
- env_runner – Replaces the worker arg. Reference to the current EnvRunner.
- metrics_logger – The MetricsLogger object inside the env_runner. Can be used to log custom metrics after Episode creation.
- env – Replaces the base_env arg. The gym.Env (new API stack) or RLlib BaseEnv (old API stack) running the episode. On the old stack, the underlying sub-environment objects can be retrieved by calling base_env.get_sub_environments().
- rl_module – Replaces the policies arg. Either the RLModule (new API stack) or a dict mapping policy IDs to policy objects (old stack). In single-agent mode, there will only be a single policy/RLModule under the rl_module["default_policy"] key.
- env_index – The index of the sub-environment that is about to be reset (within the vector of sub-environments of the BaseEnv). 
- kwargs – Forward compatibility placeholder.
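As a minimal sketch of how these parameters might be used, the following subclass overrides on_episode_created with the keyword-only signature documented above. CountCreatedEpisodes and StubLogger are hypothetical names made up for this example. The call metrics_logger.log_value(key, value, reduce="sum") is assumed to match the MetricsLogger API of the installed Ray version, so check it against your release. The import falls back to a plain base class so that the sketch also runs where Ray is not installed.

```python
from collections import Counter

try:
    # Import path taken from the title of this page.
    from ray.rllib.callbacks.callbacks import RLlibCallback
except ImportError:
    # Fallback so the sketch still runs without Ray installed.
    RLlibCallback = object


class CountCreatedEpisodes(RLlibCallback):
    """Hypothetical callback: counts created episodes per sub-environment."""

    def __init__(self):
        super().__init__()
        self.created_per_env = Counter()

    def on_episode_created(
        self,
        *,
        episode,
        env_runner=None,
        metrics_logger=None,
        env=None,
        rl_module=None,
        env_index,
        **kwargs,
    ):
        # The episode exists, but env.reset() has not been called yet,
        # so only bookkeeping that needs no observations belongs here.
        self.created_per_env[env_index] += 1
        if metrics_logger is not None:
            # Assumption: log_value(key, value, reduce=...) is available.
            metrics_logger.log_value("num_episodes_created", 1, reduce="sum")


class StubLogger:
    """Stand-in for MetricsLogger, used only to exercise the sketch."""

    def __init__(self):
        self.logged = []

    def log_value(self, key, value, **kwargs):
        self.logged.append((key, value))


callback = CountCreatedEpisodes()
stub_logger = StubLogger()
callback.on_episode_created(
    episode=object(), metrics_logger=stub_logger, env_index=0
)
callback.on_episode_created(episode=object(), env_index=1)
```

In a real setup, the class itself (not an instance) would typically be handed to the algorithm configuration, for example through the config's callbacks(...) method, and RLlib would then supply the actual episode, env_runner, and metrics_logger objects.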