ray.rllib.offline.offline_prelearner.OfflinePreLearner
- class ray.rllib.offline.offline_prelearner.OfflinePreLearner(*, config: AlgorithmConfig, learner: Learner | list[ActorHandle], spaces: Tuple[gymnasium.Space, gymnasium.Space] | None = None, module_spec: MultiRLModuleSpec | None = None, module_state: Dict[str, Any] | None = None, **kwargs: Dict[str, Any])
  Class that coordinates data transformation from dataset to learner.

  This class is an essential part of the new Offline RL API of RLlib. It is a callable class that is run in `ray.data.Dataset.map_batches` when iterating over batches for training. Its basic function is to convert the data in a batch from rows to episodes (`SingleAgentEpisode`s for now) and then to run the learner connector pipeline to convert them further into trainable batches. These batches are used directly in the `Learner`'s `update` method.

  The main reason to run these transformations inside `map_batches` is better performance. Batches can be prefetched in `ray.data`, so batch transformation can run highly parallelized alongside the `Learner`'s `update`.

  This class can be overridden to implement custom logic for transforming batches and making them `Learner`-ready. When deriving from this class, the `__call__` method and `_map_to_episodes` can be overridden to implement custom logic for the complete transformation pipeline (`__call__`) or for the conversion to episodes only (`_map_to_episodes`). For an example of how this class can also be used to compute values and advantages, see `rllib.algorithms.marwil.marwil_prelearner.MARWILOfflinePreLearner`.

  Custom `OfflinePreLearner` classes can be passed into `AlgorithmConfig.offline_data` through its `prelearner_class` argument. The `OfflineData` class then uses the custom class in its data pipeline.

  PublicAPI (alpha): This API is in alpha and may change before becoming stable.

  Methods

  Attributes

  - Sets the default replay buffer.
  - Sets the default arguments for the replay buffer.
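The override pattern described above (a callable that first maps rows to episodes and then builds a trainable batch) can be illustrated with a minimal, library-free sketch. Everything in it is a hypothetical stand-in: `ToyEpisode`, `ToyPreLearner`, `ReturnsPreLearner`, `_to_train_batch`, and the column names `eps_id`, `obs`, `actions`, and `rewards` are assumptions made for illustration and are not part of the RLlib API. The subclass computes discounted returns as a simplified analogue of computing values and advantages.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class ToyEpisode:
    """Hypothetical stand-in for an episode (not RLlib's SingleAgentEpisode)."""

    episode_id: str
    observations: List[Any] = field(default_factory=list)
    actions: List[Any] = field(default_factory=list)
    rewards: List[float] = field(default_factory=list)


class ToyPreLearner:
    """Toy callable: column-oriented rows -> episodes -> train batch."""

    def __call__(self, batch: Dict[str, List[Any]]) -> Dict[str, List[Any]]:
        episodes = self._map_to_episodes(batch)
        return self._to_train_batch(episodes)

    def _map_to_episodes(self, batch: Dict[str, List[Any]]) -> List[ToyEpisode]:
        # Group rows by episode id, keeping the order of first appearance.
        episodes: Dict[str, ToyEpisode] = {}
        for eps_id, obs, act, rew in zip(
            batch["eps_id"], batch["obs"], batch["actions"], batch["rewards"]
        ):
            if eps_id not in episodes:
                episodes[eps_id] = ToyEpisode(eps_id)
            episode = episodes[eps_id]
            episode.observations.append(obs)
            episode.actions.append(act)
            episode.rewards.append(rew)
        return list(episodes.values())

    def _to_train_batch(self, episodes: List[ToyEpisode]) -> Dict[str, List[Any]]:
        # Stand-in for a connector pipeline: flatten episodes back into columns.
        out: Dict[str, List[Any]] = {"obs": [], "actions": [], "rewards": []}
        for episode in episodes:
            out["obs"].extend(episode.observations)
            out["actions"].extend(episode.actions)
            out["rewards"].extend(episode.rewards)
        return out


class ReturnsPreLearner(ToyPreLearner):
    """Overrides __call__ to add per-episode discounted returns."""

    gamma = 0.5

    def __call__(self, batch: Dict[str, List[Any]]) -> Dict[str, List[Any]]:
        episodes = self._map_to_episodes(batch)
        train_batch = self._to_train_batch(episodes)
        train_batch["returns"] = []
        for episode in episodes:
            running = 0.0
            returns: List[float] = []
            # Accumulate from the last step backwards: G_t = r_t + gamma * G_{t+1}.
            for reward in reversed(episode.rewards):
                running = reward + self.gamma * running
                returns.append(running)
            train_batch["returns"].extend(reversed(returns))
        return train_batch


rows = {
    "eps_id": ["a", "b", "a"],
    "obs": [0, 1, 2],
    "actions": [1, 0, 1],
    "rewards": [1.0, 2.0, 1.0],
}
train_batch = ReturnsPreLearner()(rows)
```

With the interleaved rows above, the toy groups the two steps of episode `a` before the single step of episode `b`, so `train_batch["obs"]` is `[0, 2, 1]` and `train_batch["returns"]` is `[1.5, 1.0, 2.0]`. In this sketch, overriding only `_map_to_episodes` would change how rows are grouped while leaving the rest of the pipeline untouched, which mirrors the two override points named in the description.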