ray.rllib.env.single_agent_episode.SingleAgentEpisode.get_infos
- SingleAgentEpisode.get_infos(indices: int | slice | List[int] | None = None, *, neg_index_as_lookback: bool = False, fill: Any | None = None) -> Any
Returns individual info dicts or list (ranges) thereof from this episode.
- Parameters:
indices – A single int is interpreted as an index, from which to return the individual info dict stored at this index. A list of ints is interpreted as a list of indices from which to gather individual info dicts in a list of size len(indices). A slice object is interpreted as a range of info dicts to be returned. By default, negative indices are interpreted as "before the end" unless the neg_index_as_lookback=True option is used, in which case negative indices are interpreted as "before ts=0", meaning going back into the lookback buffer. If None, all infos (from ts=0 to the end) are returned.
neg_index_as_lookback – If True, negative values in indices are interpreted as "before ts=0", meaning going back into the lookback buffer. For example, an episode with infos [{"l":4}, {"l":5}, {"l":6}, {"a":7}, {"b":8}, {"c":9}], where the first 3 items are the lookback buffer (the ts=0 item is {"a":7}), will respond to get_infos(-1, neg_index_as_lookback=True) with {"l":6} and to get_infos(slice(-2, 1), neg_index_as_lookback=True) with [{"l":5}, {"l":6}, {"a":7}].
fill – An optional value to use for filling up the returned results at the boundaries. This filling only happens if the requested index range's start/stop boundaries exceed the episode's boundaries (including the lookback buffer on the left side). This is useful when callers want results auto-filled instead of handling these boundaries themselves. For example, an episode with infos [{"l":10}, {"l":11}, {"a":12}, {"b":13}, {"c":14}] and a lookback buffer size of 2 (meaning infos {"l":10}, {"l":11} are part of the lookback buffer) will respond to get_infos(slice(-7, -2), fill={"o": 0.0}) with [{"o":0.0}, {"o":0.0}, {"l":10}, {"l":11}, {"a":12}].
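The indexing rules described for indices, neg_index_as_lookback, and fill can be sketched in plain Python. The function below is a hypothetical model written for illustration only: model_get_infos, its data and lookback arguments, and its out-of-range behavior when no fill is given (clipping for slices, IndexError for ints) are assumptions and are not taken from RLlib. It only reproduces the two worked examples from the parameter descriptions.

```python
from typing import Any, List, Optional, Union


def model_get_infos(
    data: List[Any],
    lookback: int,
    indices: Union[int, slice, List[int], None] = None,
    *,
    neg_index_as_lookback: bool = False,
    fill: Optional[Any] = None,
) -> Any:
    """Hypothetical model of the documented rules (not RLlib code).

    `data` holds the lookback items followed by the items from ts=0 on;
    `lookback` is the number of leading lookback items.
    """
    total = len(data)

    def to_pos(i: int) -> int:
        # Negative indices count from the end by default; with
        # neg_index_as_lookback=True they count back from ts=0.
        if i < 0 and not neg_index_as_lookback:
            return total + i
        return lookback + i

    def fetch(pos: int) -> Any:
        if 0 <= pos < total:
            return data[pos]
        if fill is None:
            # Assumption of this sketch: no fill means out-of-range ints fail.
            raise IndexError(f"position {pos} is outside the stored data")
        return fill

    if indices is None:
        return list(data[lookback:])  # everything from ts=0 to the end
    if isinstance(indices, slice):
        if indices.step not in (None, 1):
            raise ValueError("this sketch only models step=1 slices")
        start = lookback if indices.start is None else to_pos(indices.start)
        stop = total if indices.stop is None else to_pos(indices.stop)
        if fill is None:
            # Assumption of this sketch: clip to the stored range.
            start, stop = max(start, 0), min(stop, total)
        return [fetch(p) for p in range(start, stop)]
    if isinstance(indices, list):
        return [fetch(to_pos(i)) for i in indices]
    return fetch(to_pos(indices))


# The two worked examples from the parameter descriptions.
ep1 = [{"l": 4}, {"l": 5}, {"l": 6}, {"a": 7}, {"b": 8}, {"c": 9}]
ep2 = [{"l": 10}, {"l": 11}, {"a": 12}, {"b": 13}, {"c": 14}]
lookback_single = model_get_infos(ep1, 3, -1, neg_index_as_lookback=True)
lookback_slice = model_get_infos(ep1, 3, slice(-2, 1), neg_index_as_lookback=True)
filled = model_get_infos(ep2, 2, slice(-7, -2), fill={"o": 0.0})
```

In this model, every index is first mapped to a position in the full stored list (lookback items included), which is why a single fill check per position is enough to cover both the left and right boundaries.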
Examples:
from ray.rllib.env.single_agent_episode import SingleAgentEpisode

episode = SingleAgentEpisode(
    infos=[{"a":0}, {"b":1}, {"c":2}, {"d":3}],
    # The following is needed, but not relevant for this demo.
    observations=[0, 1, 2, 3], actions=[1, 2, 3], rewards=[1, 2, 3],
    len_lookback_buffer=0,  # no lookback; all data is actually "in" episode
)
# Plain usage (`indices` arg only).
episode.get_infos(-1)  # {"d":3}
episode.get_infos(0)  # {"a":0}
episode.get_infos([0, 2])  # [{"a":0},{"c":2}]
episode.get_infos([-1, 0])  # [{"d":3},{"a":0}]
episode.get_infos(slice(None, 2))  # [{"a":0},{"b":1}]
episode.get_infos(slice(-2, None))  # [{"c":2},{"d":3}]

# Using `fill=...` (requesting slices beyond the boundaries).
# TODO (sven): This would require a space being provided. Maybe we can
#  skip this check for infos, which don't have a space anyways.
# episode.get_infos(slice(-5, -3), fill={"o":-1})  # [{"o":-1},{"a":0}]
# episode.get_infos(slice(3, 5), fill={"o":-2})  # [{"d":3},{"o":-2}]
- Returns:
The collected info dicts. As a 0-axis batch, if there are several indices or a list of exactly one index provided OR indices is a slice object. As single item (B=0 -> no additional 0-axis) if indices is a single int.
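Because the return shape depends on the type of indices, calling code that always wants a list may find a small wrapper convenient. The helper below is a hypothetical caller-side sketch: as_info_list is not part of RLlib, and it relies only on the rule stated above that a single int yields one unbatched item while a list or slice yields a batch.

```python
from typing import Any, List, Union


def as_info_list(
    result: Any, indices: Union[int, slice, List[int], None]
) -> List[Any]:
    """Hypothetical helper: normalize a get_infos-style result to a list."""
    # Only a single int produces an unbatched item; wrap it in a list.
    if isinstance(indices, int) and not isinstance(indices, bool):
        return [result]
    # Lists, slices, and None already produce a batch along axis 0.
    return list(result)


# Results are written out by hand here, mirroring the documented examples.
single = as_info_list({"d": 3}, -1)
one_item_list = as_info_list([{"a": 0}], [0])
sliced = as_info_list([{"a": 0}, {"b": 1}], slice(None, 2))
```

Note that a list containing exactly one index already returns a batch of size one, so only the bare-int case needs wrapping.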