ray.rllib.env.single_agent_episode.SingleAgentEpisode.cut

SingleAgentEpisode.cut(len_lookback_buffer: int = 0) → SingleAgentEpisode

Returns a successor episode chunk (of length 0) continuing from this episode.

The successor will have the same ID as self. If no lookback buffer is requested (len_lookback_buffer=0), the successor's observations will be the last observation(s) of self and its length will therefore be 0 (no further steps taken yet). If len_lookback_buffer > 0, the returned successor will have len_lookback_buffer observations (and actions, rewards, etc.) taken from the right side (end) of self. For example, if len_lookback_buffer=2, the returned successor's lookback buffer actions will be identical to self.actions[-2:].

This method is useful if you would like to discontinue building an episode chunk (for example, because you have to return it from somewhere), but would like to have a new episode instance to continue building the actual gym.Env episode at a later time. Via the len_lookback_buffer argument, the continuing chunk (successor) will still be able to "look back" into this predecessor episode's data (at least to some extent, depending on the value of len_lookback_buffer).
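A minimal sketch of this hand-off pattern (the toy observation, action, and reward values are made up for illustration; the method and attribute names are from the SingleAgentEpisode API):

from ray.rllib.env.single_agent_episode import SingleAgentEpisode

episode = SingleAgentEpisode()
episode.add_env_reset(observation=0)
for t in range(3):
    episode.add_env_step(observation=t + 1, action=t, reward=1.0)

# Hand `episode` off (e.g. return it from a rollout) and keep building
# the ongoing gym.Env episode on the successor chunk.
successor = episode.cut()
assert successor.id_ == episode.id_  # same ID as the predecessor
assert len(successor) == 0           # length 0: no steps taken yet
# Its single observation is the predecessor's last one.
assert successor.observations[0] == episode.observations[-1]
# Continue stepping through the (conceptual) env on the successor.
successor.add_env_step(observation=4, action=3, reward=1.0)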

Parameters:

len_lookback_buffer – The number of timesteps to carry over into the new chunk as a "lookback buffer". A lookback buffer is additional data on the left side of the actual episode data, kept for visibility purposes (but without actually being part of the new chunk). For example, if self ends in actions 5, 6, 7, and 8, and we call self.cut(len_lookback_buffer=2), the returned chunk will have actions 7 and 8 already in it, but still t_started == t == 8 (not 7!) and a length of 0 (see the sketch after the Returns section below). If there is not enough data in self yet to fulfill the len_lookback_buffer request, the value of len_lookback_buffer is automatically adjusted (lowered).

Returns:

The successor episode chunk of this one, with the same ID and state, and with the last observation in self as its only observation (plus any requested lookback data).
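A sketch of the lookback behavior from the Parameters section above. The toy values are chosen so the episode ends in actions 5 through 8; the getter flag name neg_index_as_lookback in the comment is assumed from recent Ray versions and may differ in older ones:

from ray.rllib.env.single_agent_episode import SingleAgentEpisode

# Build an episode whose 8 actions end in 5, 6, 7, and 8 (so t == 8).
episode = SingleAgentEpisode(
    observations=[0, 1, 2, 3, 4, 5, 6, 7, 8],  # 9 observations -> 8 timesteps
    actions=[1, 2, 3, 4, 5, 6, 7, 8],
    rewards=[0.1] * 8,
    len_lookback_buffer=0,  # all of the above is actual episode data
)
assert episode.t == 8

successor = episode.cut(len_lookback_buffer=2)
assert len(successor) == 0                      # no steps taken yet
assert successor.t_started == successor.t == 8  # NOT 7!
# Actions 7 and 8 sit in the successor's lookback buffer only and can be
# read back through the getter APIs, e.g. (flag name assumed):
# successor.get_actions(slice(-2, None), neg_index_as_lookback=True)
# -> [7, 8], identical to episode.actions[-2:]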