ray.rllib.core.rl_module.default_model_config.DefaultModelConfig#
class ray.rllib.core.rl_module.default_model_config.DefaultModelConfig(
    fcnet_hiddens: List[int] = <factory>,
    fcnet_activation: str = 'tanh',
    fcnet_kernel_initializer: str | Callable | None = None,
    fcnet_kernel_initializer_kwargs: dict | None = None,
    fcnet_bias_initializer: str | Callable | None = None,
    fcnet_bias_initializer_kwargs: dict | None = None,
    conv_filters: List[Tuple[int, int | Tuple[int, int], int | Tuple[int, int]]] | None = None,
    conv_activation: str = 'relu',
    conv_kernel_initializer: str | Callable | None = None,
    conv_kernel_initializer_kwargs: dict | None = None,
    conv_bias_initializer: str | Callable | None = None,
    conv_bias_initializer_kwargs: dict | None = None,
    head_fcnet_hiddens: List[int] = <factory>,
    head_fcnet_activation: str = 'relu',
    head_fcnet_kernel_initializer: str | Callable | None = None,
    head_fcnet_kernel_initializer_kwargs: dict | None = None,
    head_fcnet_bias_initializer: str | Callable | None = None,
    head_fcnet_bias_initializer_kwargs: dict | None = None,
    free_log_std: bool = False,
    log_std_clip_param: float = 20.0,
    vf_share_layers: bool = True,
    use_lstm: bool = False,
    max_seq_len: int = 20,
    lstm_cell_size: int = 256,
    lstm_use_prev_action: bool = False,
    lstm_use_prev_reward: bool = False,
    lstm_kernel_initializer: str | Callable | None = None,
    lstm_kernel_initializer_kwargs: dict | None = None,
    lstm_bias_initializer: str | Callable | None = None,
    lstm_bias_initializer_kwargs: dict | None = None,
)[source]#
Dataclass to configure all default RLlib RLModules.

Users should NOT use this class for configuring their own custom RLModules. Instead, pass a custom `model_config` dict with arbitrary (str) keys into the `RLModuleSpec` used to define the custom RLModule. For example:

```python
import gymnasium as gym
import numpy as np

from ray.rllib.core.rl_module.rl_module import RLModuleSpec
from ray.rllib.examples.rl_modules.classes.tiny_atari_cnn_rlm import (
    TinyAtariCNN
)

my_rl_module = RLModuleSpec(
    module_class=TinyAtariCNN,
    observation_space=gym.spaces.Box(-1.0, 1.0, (64, 64, 4), np.float32),
    action_space=gym.spaces.Discrete(7),
    # DreamerV3-style stack working on a 64x64, color or 4x-grayscale-stacked,
    # normalized image.
    model_config={
        "conv_filters": [[16, 4, 2], [32, 4, 2], [64, 4, 2], [128, 4, 2]],
    },
).build()
```

Only RLlib's default RLModules (defined by the various algorithms) should use this dataclass. Pass an instance of it into your algorithm config like so:

```python
from ray.rllib.algorithms.ppo import PPOConfig
from ray.rllib.core.rl_module.default_model_config import DefaultModelConfig

config = (
    PPOConfig()
    .rl_module(
        model_config=DefaultModelConfig(fcnet_hiddens=[32, 32]),
    )
)
```

DeveloperAPI: This API may change across minor Ray releases.

Methods

Attributes

- conv_activation: Activation function descriptor for the stack configured by conv_filters.
- conv_bias_initializer: Initializer function or class descriptor for the bias vectors in the stack configured by conv_filters.
- conv_bias_initializer_kwargs: Kwargs passed into the initializer function defined through conv_bias_initializer.
- conv_filters: List of lists of format [num_out_channels, kernel, stride] defining a Conv2D stack if the input space is 2D.
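The `conv_filters` triplets above can be read as a recipe for how each Conv2D layer shrinks the image. As a minimal sketch (not RLlib code), and assuming "same" padding so that each layer's spatial output size is `ceil(input_size / stride)`:

```python
import math

def conv_stack_shapes(image_size, conv_filters):
    """Trace the (height, width, channels) shape through a conv_filters stack.

    Assumes a square input and "same" padding, so each layer outputs
    ceil(size / stride) pixels per spatial dimension. The kernel size does
    not affect the spatial shape under this assumption.
    """
    shapes = []
    size = image_size
    for num_out_channels, _kernel, stride in conv_filters:
        size = math.ceil(size / stride)
        shapes.append((size, size, num_out_channels))
    return shapes

# The DreamerV3-style stack from the example above, on a 64x64 input:
print(conv_stack_shapes(64, [[16, 4, 2], [32, 4, 2], [64, 4, 2], [128, 4, 2]]))
# -> [(32, 32, 16), (16, 16, 32), (8, 8, 64), (4, 4, 128)]
```

Each stride-2 layer halves the spatial resolution while the channel count grows, ending in a small 4x4x128 feature map.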
- conv_kernel_initializer: Initializer function or class descriptor for the weight/kernel matrices in the stack configured by conv_filters.
- conv_kernel_initializer_kwargs: Kwargs passed into the initializer function defined through conv_kernel_initializer.
- fcnet_activation: Activation function descriptor for the stack configured by fcnet_hiddens.
- fcnet_bias_initializer: Initializer function or class descriptor for the bias vectors in the stack configured by fcnet_hiddens.
- fcnet_bias_initializer_kwargs: Kwargs passed into the initializer function defined through fcnet_bias_initializer.
- fcnet_kernel_initializer: Initializer function or class descriptor for the weight/kernel matrices in the stack configured by fcnet_hiddens.
- fcnet_kernel_initializer_kwargs: Kwargs passed into the initializer function defined through fcnet_kernel_initializer.
- free_log_std: If True, for DiagGaussian action distributions (or any other continuous control distribution), make the second half of the policy's outputs a "free" bias parameter, rather than state-/NN-dependent nodes.
- head_fcnet_activation: Activation function descriptor for the stack configured by head_fcnet_hiddens.
- head_fcnet_bias_initializer: Initializer function or class descriptor for the bias vectors in the stack configured by head_fcnet_hiddens.
- head_fcnet_bias_initializer_kwargs: Kwargs passed into the initializer function defined through head_fcnet_bias_initializer.
- head_fcnet_kernel_initializer: Initializer function or class descriptor for the weight/kernel matrices in the stack configured by head_fcnet_hiddens.
- head_fcnet_kernel_initializer_kwargs: Kwargs passed into the initializer function defined through head_fcnet_kernel_initializer.
- log_std_clip_param: Clip value for the log(stddev) when using a DiagGaussian action distribution (or any other continuous control distribution).
- lstm_bias_initializer: Initializer function or class descriptor for the bias vectors in the LSTM layer.
- lstm_bias_initializer_kwargs: Kwargs passed into the initializer function defined through lstm_bias_initializer.
- lstm_cell_size: The size of the LSTM cell.
- lstm_kernel_initializer: Initializer function or class descriptor for the weight/kernel matrices in the LSTM layer.
- lstm_kernel_initializer_kwargs: Kwargs passed into the initializer function defined through lstm_kernel_initializer.
- max_seq_len: The maximum seq len for building the train batch for an LSTM model.
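To illustrate what free_log_std and log_std_clip_param mean for a DiagGaussian head, here is a hypothetical pure-Python sketch (not RLlib's implementation; the function name and list-based representation are illustrative only):

```python
import math

def split_gaussian_outputs(nn_out, free_log_std_bias=None, log_std_clip_param=20.0):
    """Split raw policy outputs into (mean, stddev) for a diagonal Gaussian.

    If free_log_std_bias is given (the free_log_std=True case), the NN
    outputs only the means, and log(stddev) is a state-independent, learned
    bias vector. Otherwise (default), the first half of nn_out holds the
    means and the second half the log(stddev)s.
    """
    if free_log_std_bias is not None:
        mean, log_std = list(nn_out), list(free_log_std_bias)
    else:
        half = len(nn_out) // 2
        mean, log_std = nn_out[:half], nn_out[half:]
    # log_std_clip_param bounds log(stddev) to avoid numeric blow-ups in exp().
    log_std = [max(-log_std_clip_param, min(log_std_clip_param, v)) for v in log_std]
    stddev = [math.exp(v) for v in log_std]
    return mean, stddev

# Default split: 4 outputs -> 2 means + 2 log(stddev)s; 100.0 gets clipped to 20.0.
mean, stddev = split_gaussian_outputs([0.5, -0.5, 0.0, 100.0])
```

The clipping keeps a single badly-scaled log(stddev) output from producing an infinite standard deviation during early training.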
- use_lstm: Whether to wrap the encoder component (defined by fcnet_hiddens or conv_filters) with an LSTM.
- vf_share_layers: Whether encoder layers (defined by fcnet_hiddens or conv_filters) should be shared between the policy and the value function.
- fcnet_hiddens: List containing the sizes (number of nodes) of a fully connected (MLP) stack.
- head_fcnet_hiddens: List containing the sizes (number of nodes) of a fully connected (MLP) head (e.g., a policy or value head).
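To make the interaction between use_lstm and max_seq_len concrete: when training an LSTM model, episode data is cut into fixed-length sequences of at most max_seq_len timesteps. A minimal sketch of that chunking, assuming zero-padding of the final, shorter chunk (this is an illustration, not RLlib's actual batching code):

```python
def chunk_episode(timesteps, max_seq_len=20, pad_value=0):
    """Cut one episode's timesteps into chunks of exactly max_seq_len.

    The last chunk is padded with pad_value up to max_seq_len, so every
    sequence fed to the LSTM has the same length.
    """
    chunks = []
    for start in range(0, len(timesteps), max_seq_len):
        chunk = timesteps[start:start + max_seq_len]
        chunk = chunk + [pad_value] * (max_seq_len - len(chunk))
        chunks.append(chunk)
    return chunks

# A 7-step episode with max_seq_len=3 -> three chunks, last one padded:
print(chunk_episode([1, 2, 3, 4, 5, 6, 7], max_seq_len=3))
# -> [[1, 2, 3], [4, 5, 6], [7, 0, 0]]
```

Smaller max_seq_len values shorten the backpropagation-through-time window and reduce memory use, at the cost of truncating longer temporal dependencies.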