ray.rllib.env.multi_agent_episode.MultiAgentEpisode.cut
- MultiAgentEpisode.cut(len_lookback_buffer: int = 0) → MultiAgentEpisode [source]
Returns a successor episode chunk (of len=0) continuing from this Episode.
The successor will have the same ID as `self`. If no lookback buffer is requested (`len_lookback_buffer=0`), the successor’s observations will be the last observation(s) of `self` and its length will therefore be 0 (no further steps taken yet). If `len_lookback_buffer` > 0, the returned successor will have `len_lookback_buffer` observations (and actions, rewards, etc.) taken from the right side (end) of `self`. For example, if `len_lookback_buffer=2`, the returned successor’s lookback buffer actions will be identical to the results of `self.get_actions([-2, -1])`.

This method is useful if you would like to discontinue building an episode chunk (because you have to return it from somewhere), but would like to have a new episode instance to continue building the actual gym.Env episode at a later time. Via the `len_lookback_buffer` argument, the continuing chunk (successor) will still be able to “look back” into this predecessor episode’s data (at least to some extent, depending on the value of `len_lookback_buffer`).
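The following is a minimal, illustrative sketch of this hand-off pattern (not part of the API reference). It assumes an already built and partially stepped `MultiAgentEpisode` is passed in; the helper name `hand_off` is made up for the example, and the sketch only relies on `cut()`, `get_actions()`, the `id_` attribute, and `len()` behaving as described in this API:

```python
from ray.rllib.env.multi_agent_episode import MultiAgentEpisode


def hand_off(episode: MultiAgentEpisode) -> MultiAgentEpisode:
    """Cuts `episode` and returns the successor chunk to keep building on.

    Assumes `episode` has collected at least two environment steps so far.
    """
    # The successor starts empty, but carries the last 2 timesteps of
    # `episode` as its lookback buffer.
    successor = episode.cut(len_lookback_buffer=2)

    # Same underlying gym.Env episode -> same ID (assuming `id_` holds it) ...
    assert successor.id_ == episode.id_
    # ... but no steps have been taken in the new chunk yet.
    assert len(successor) == 0

    # Per the description above, the actions in the successor's lookback
    # buffer are identical to `episode.get_actions([-2, -1])`.

    # `episode` can now be returned/stored elsewhere, while env stepping
    # continues into `successor`.
    return successor
```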
- Parameters:
len_lookback_buffer – The number of environment timesteps to take along into the new chunk as “lookback buffer”. A lookback buffer is additional data on the left side of the actual episode data, kept for visibility purposes (but without actually being part of the new chunk). For example, if `self` ends in actions agent_1=5,6,7 and agent_2=6,7, and we call `self.cut(len_lookback_buffer=2)`, the returned chunk will have actions 6 and 7 for both agents already in it, but still `t_started == t == 8` (not 7!) and a length of 0. If there is not enough data in `self` yet to fulfil the `len_lookback_buffer` request, the value of `len_lookback_buffer` is automatically adjusted (lowered); see the sketch at the end of this section.
- Returns:
The successor Episode chunk of this one, with the same ID and state, and with its only observation being the last observation in `self`.
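As a short, hedged illustration of the automatic lowering of `len_lookback_buffer` mentioned under Parameters (the helper name below is made up; only `cut()` and `len()` are used, and the episode is assumed to have been built elsewhere):

```python
from ray.rllib.env.multi_agent_episode import MultiAgentEpisode


def cut_with_generous_lookback(episode: MultiAgentEpisode) -> MultiAgentEpisode:
    """Asks for far more lookback than `episode` can provide.

    Assumes `episode` has only collected a handful of environment steps.
    """
    # The requested lookback exceeds the available data; per the Parameters
    # description above, it is automatically lowered to what `episode` holds.
    successor = episode.cut(len_lookback_buffer=1_000_000)

    # The new chunk itself still starts at length 0; the lookback buffer only
    # provides visibility into the predecessor, it is not part of the chunk.
    assert len(successor) == 0

    return successor
```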