MultiAgentEnvRunner API#
Note
Ray 2.40 uses RLlib’s new API stack by default. The Ray team has mostly completed transitioning algorithms, example scripts, and documentation to the new code base.
If you’re still using the old API stack, see New API stack migration guide for details on how to migrate.
rllib.env.multi_agent_env_runner.MultiAgentEnvRunner#
- class ray.rllib.env.multi_agent_env_runner.MultiAgentEnvRunner(config: AlgorithmConfig, **kwargs)[source]#
- The generic environment runner for the multi-agent case.
- PublicAPI (alpha): This API is in alpha and may change before becoming stable.
- __init__(config: AlgorithmConfig, **kwargs)[source]#
- Initializes a MultiAgentEnvRunner instance.
- Parameters:
- config – An `AlgorithmConfig` object containing all settings needed to build this `EnvRunner`.
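For illustration, a minimal sketch of building a runner directly from a config. The `MultiAgentCartPole` example env, its import path, the policy IDs `p0`/`p1`, and the mapping function are assumptions made for this sketch, not requirements of the API:

```python
from ray.rllib.algorithms.ppo import PPOConfig
from ray.rllib.env.multi_agent_env_runner import MultiAgentEnvRunner
# Example multi-agent env bundled with RLlib (import path assumed).
from ray.rllib.examples.envs.classes.multi_agent import MultiAgentCartPole

config = (
    PPOConfig()
    .environment(MultiAgentCartPole, env_config={"num_agents": 2})
    .multi_agent(
        policies={"p0", "p1"},
        # MultiAgentCartPole uses integer agent IDs; map 0 -> p0, 1 -> p1.
        policy_mapping_fn=lambda agent_id, episode, **kwargs: f"p{agent_id % 2}",
    )
)

# Build the EnvRunner from the finished config.
env_runner = MultiAgentEnvRunner(config=config)
```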
 
 - sample(*, num_timesteps: int | None = None, num_episodes: int | None = None, explore: bool | None = None, random_actions: bool = False, force_reset: bool = False) List[MultiAgentEpisode][source]#
- Runs and returns a sample (n timesteps or m episodes) on the env(s).
- Parameters:
- num_timesteps – The number of timesteps to sample during this call. Note that only one of `num_timesteps` or `num_episodes` may be provided.
- num_episodes – The number of episodes to sample during this call. Note that only one of `num_timesteps` or `num_episodes` may be provided.
- explore – If True, will use the RLModule’s `forward_exploration()` method to compute actions. If False, will use the RLModule’s `forward_inference()` method. If None (default), will use the `explore` boolean setting from `self.config` passed into this EnvRunner’s constructor. You can change this setting in your config via `config.env_runners(explore=True|False)`.
- random_actions – If True, actions will be sampled randomly (from the action space of the environment). If False (default), actions or action distribution parameters are computed by the RLModule. 
- force_reset – Whether to force-reset all (vector) environments before sampling. Useful if you would like to collect a clean slate of new episodes via this call. Note that when sampling n episodes (`num_episodes != None`), this is fixed to True.
 
- Returns:
- A list of `MultiAgentEpisode` instances, carrying the sampled data.
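A short usage sketch, assuming the `env_runner` built in the constructor example above; counting steps via `MultiAgentEpisode.env_steps()` is illustrative:

```python
# Sample exactly 100 timesteps; actions come from forward_exploration().
episodes = env_runner.sample(num_timesteps=100, explore=True)

# Or sample 5 complete episodes (force_reset is implicitly True here).
episodes = env_runner.sample(num_episodes=5)

# Each returned item is a MultiAgentEpisode.
total_env_steps = sum(episode.env_steps() for episode in episodes)
```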
 
 - get_metrics() Dict[source]#
- Returns metrics (in any form) for the episodes collected and completed thus far.
- Returns:
- Metrics of any form. 
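For example (continuing the sketch above); the exact structure of the returned metrics is intentionally unspecified and may vary across RLlib versions:

```python
env_runner.sample(num_episodes=3)

# Metrics for the completed episodes, e.g. return and length stats.
metrics = env_runner.get_metrics()
print(metrics)
```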
 
 - make_env()[source]#
- Creates the RL environment for this EnvRunner and assigns it to `self.env`.
- Note that users should be able to change the EnvRunner’s config (e.g. change `self.config.env_config`) and then call this method to create new environments with the updated configuration. It should also be called after a failure of an earlier env in order to clean up the existing env (for example, `close()` it), re-create a new one, and then continue sampling with that new env.
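A sketch of this workflow, assuming the runner's `env_config` is a plain dict with a `num_agents` key (as in the constructor example):

```python
# Update the env config on this runner, then rebuild the environment.
env_runner.config.env_config["num_agents"] = 4  # assumed dict-style env_config
env_runner.make_env()  # replaces env_runner.env with a freshly built env

# Continue sampling with the new environment.
episodes = env_runner.sample(num_episodes=1)
```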