ray.rllib.env.single_agent_episode.SingleAgentEpisode.slice
- SingleAgentEpisode.slice(slice_: slice, *, len_lookback_buffer: int | None = None) → SingleAgentEpisode
Returns a slice of this episode with the given slice object.
For example, if self contains o0 (the reset observation), o1, o2, o3, and o4 and the actions a1, a2, a3, and a4 (len of self is 4), then a call to self.slice(slice(1, 3)) would return a new SingleAgentEpisode with observations o1, o2, and o3, and actions a2 and a3. Note that there is always one more observation in an episode than there are actions (and rewards and extra model outputs), due to the initial observation received after an env reset.

```python
from ray.rllib.env.single_agent_episode import SingleAgentEpisode
from ray.rllib.utils.test_utils import check

# Generate a simple single-agent episode.
observations = [0, 1, 2, 3, 4, 5]
actions = [1, 2, 3, 4, 5]
rewards = [0.1, 0.2, 0.3, 0.4, 0.5]
episode = SingleAgentEpisode(
    observations=observations,
    actions=actions,
    rewards=rewards,
    len_lookback_buffer=0,  # all given data is part of the episode
)

slice_1 = episode[:1]
check(slice_1.observations, [0, 1])
check(slice_1.actions, [1])
check(slice_1.rewards, [0.1])

slice_2 = episode[-2:]
check(slice_2.observations, [3, 4, 5])
check(slice_2.actions, [4, 5])
check(slice_2.rewards, [0.4, 0.5])
```
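The following is a minimal sketch that replays the slice(1, 3) case described above by calling slice() directly on an episode built like the one in the example (the square-bracket indexing used above is assumed to be equivalent to calling slice()). The expected values follow from the rule that a slice over timesteps 1 and 2 carries observations o1 through o3, but only actions and rewards for timesteps 1 and 2:

```python
from ray.rllib.env.single_agent_episode import SingleAgentEpisode
from ray.rllib.utils.test_utils import check

# Same data as in the example above: observations o0..o5, actions a1..a5.
episode = SingleAgentEpisode(
    observations=[0, 1, 2, 3, 4, 5],
    actions=[1, 2, 3, 4, 5],
    rewards=[0.1, 0.2, 0.3, 0.4, 0.5],
    len_lookback_buffer=0,  # all given data is part of the episode
)

# Slice out timesteps 1 and 2, i.e. the slice(1, 3) case from the text.
sliced = episode.slice(slice(1, 3))
check(sliced.observations, [1, 2, 3])  # one more observation than actions
check(sliced.actions, [2, 3])
check(sliced.rewards, [0.2, 0.3])
```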
- Parameters:
slice_ – The slice object to use for slicing. Its indices should exclude the lookback buffer, which will be prepended automatically to the returned slice.
len_lookback_buffer – If not None, forces the returned slice to try to have this number of timesteps in its lookback buffer (if available). If None (default), tries to make the returned slice’s lookback as large as the current lookback buffer of this episode (self). See the sketch at the end of this section.
- Returns:
The new SingleAgentEpisode representing the requested slice.
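Below is a minimal sketch of the len_lookback_buffer argument, reusing the episode construction from the example above. The argument is assumed to only affect the lookback data carried along by the returned slice (accessible through the episode's getter APIs, not shown here), so the checks only cover the visible, non-lookback part of the slice, whose expected values follow from the slicing rule described above:

```python
from ray.rllib.env.single_agent_episode import SingleAgentEpisode
from ray.rllib.utils.test_utils import check

episode = SingleAgentEpisode(
    observations=[0, 1, 2, 3, 4, 5],
    actions=[1, 2, 3, 4, 5],
    rewards=[0.1, 0.2, 0.3, 0.4, 0.5],
    len_lookback_buffer=0,  # the source episode starts without lookback data
)

# Request that the returned slice keep (at most) one timestep of lookback data,
# taken from the data right before the slice start (here: timestep 1).
sliced = episode.slice(slice(2, 4), len_lookback_buffer=1)

# The visible (non-lookback) part of the slice is unaffected by the argument.
check(sliced.observations, [2, 3, 4])
check(sliced.actions, [3, 4])
check(sliced.rewards, [0.3, 0.4])
```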