mtrl.experiment package
Submodules
mtrl.experiment.dmcontrol module
Class to interface with an Experiment
- class mtrl.experiment.dmcontrol.Experiment(config: omegaconf.dictconfig.DictConfig, experiment_id: str = '0')
Bases: mtrl.experiment.multitask.Experiment
Experiment Class to manage the lifecycle of a multi-task model.
- Parameters
config (ConfigType) –
experiment_id (str, optional) – Defaults to “0”.
- evaluate_vec_env_of_tasks(vec_env: mtrl.env.vec_env.VecEnv, step: int, episode: int)
Evaluate the agent’s performance on the different environments, which are wrapped in a single vectorized environment.
Since we are evaluating on multiple tasks, additional metadata is recorded to identify which metric corresponds to which task.
- Parameters
vec_env (VecEnv) – vectorized environment.
step (int) – step for tracking the training of the agent.
episode (int) – episode for tracking the training of the agent.
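The per-task bookkeeping described above can be sketched as follows. This is a minimal illustration and not mtrl's implementation: the helper name `aggregate_task_metrics`, its arguments, and the metric key format are all hypothetical, and it assumes that position i of the vectorized environment runs the task named `task_names[i]`.

```python
from typing import Dict, List


def aggregate_task_metrics(
    episode_rewards: List[float], task_names: List[str]
) -> Dict[str, float]:
    """Label each environment's reward with the task it belongs to.

    Hypothetical helper: index i of both lists refers to the same
    environment inside the vectorized environment.
    """
    if not episode_rewards:
        raise ValueError("expected at least one reward")
    if len(episode_rewards) != len(task_names):
        raise ValueError("expected one reward per task")
    # One metric per task, keyed by the task name.
    metrics = {
        f"episode_reward_{name}": reward
        for name, reward in zip(task_names, episode_rewards)
    }
    # An aggregate across all tasks is reported alongside the per-task values.
    metrics["episode_reward"] = sum(episode_rewards) / len(episode_rewards)
    return metrics
```

Keying each metric by task name keeps the logged values unambiguous even when many tasks share one vectorized environment.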
mtrl.experiment.experiment module
Experiment class manages the lifecycle of a model.
- class mtrl.experiment.experiment.Experiment(config: omegaconf.dictconfig.DictConfig, experiment_id: str = '0')
Bases: mtrl.utils.checkpointable.Checkpointable
Experiment Class to manage the lifecycle of a model.
- Parameters
config (ConfigType) –
experiment_id (str, optional) – Defaults to “0”.
- build_envs() → Tuple[Dict[str, mtrl.env.vec_env.VecEnv], Dict[str, gym.spaces.box.Box]]
Subclasses should implement this method to build the environments.
- Raises
NotImplementedError – this method should be implemented by the subclasses.
- Returns
Tuple of environment dictionary and environment metadata.
- Return type
Tuple[EnvsDictType, EnvMetaDataType]
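The subclassing contract for build_envs can be illustrated with a small stand-alone sketch. The classes `BaseExperiment` and `ToyExperiment` below are hypothetical stand-ins rather than mtrl classes, and plain strings and tuples take the place of real vectorized environments and gym spaces.

```python
from typing import Any, Dict, Tuple


class BaseExperiment:
    """Stand-in for a base experiment class (hypothetical, not mtrl's)."""

    def build_envs(self) -> Tuple[Dict[str, Any], Dict[str, Any]]:
        # The base class only defines the interface.
        raise NotImplementedError(
            "build_envs should be implemented by the subclasses."
        )


class ToyExperiment(BaseExperiment):
    """Subclass that supplies the environments and their metadata."""

    def build_envs(self) -> Tuple[Dict[str, Any], Dict[str, Any]]:
        # Placeholder strings stand in for real vectorized environments.
        envs = {"train": "train_vec_env", "eval": "eval_vec_env"}
        # Placeholder shapes stand in for the gym spaces in the real metadata.
        env_metadata = {"observation_space": (4,), "action_space": (2,)}
        return envs, env_metadata
```

Calling `ToyExperiment().build_envs()` returns the two dictionaries, while calling the method on `BaseExperiment` raises NotImplementedError.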
- periodic_save(epoch: int) → None
Periodically save the experiment.
This is a utility method built on top of the save method. It performs an extra check of whether the experiment is configured to be saved during the current epoch.
- Parameters
epoch (int) – current epoch.
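The relationship between periodic_save and save can be sketched as below. This is an illustration under stated assumptions, not the library's code: the class `ToyExperiment` and the `save_frequency` setting are hypothetical, and the real check depends on how the experiment config expresses its save schedule.

```python
from typing import List


class ToyExperiment:
    """Hypothetical experiment exposing save and periodic_save."""

    def __init__(self, save_frequency: int) -> None:
        # Assumed setting: save every `save_frequency` epochs;
        # a value of 0 or less disables periodic saving.
        self.save_frequency = save_frequency
        self.saved_epochs: List[int] = []

    def save(self, epoch: int) -> None:
        # A real experiment would write checkpoints here.
        self.saved_epochs.append(epoch)

    def periodic_save(self, epoch: int) -> None:
        # Extra check on top of save: only save on configured epochs.
        if self.save_frequency > 0 and epoch % self.save_frequency == 0:
            self.save(epoch)
```

With `save_frequency=2`, calling `periodic_save` for epochs 0 through 4 saves at epochs 0, 2, and 4 only.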
- mtrl.experiment.experiment.get_env_metadata(env: gym.vector.async_vector_env.AsyncVectorEnv, max_episode_steps: Optional[int] = None, ordered_task_list: Optional[List[str]] = None) → Dict[str, gym.spaces.box.Box]
Get the metadata from an environment.
- mtrl.experiment.experiment.prepare_config(config: omegaconf.dictconfig.DictConfig, env_metadata: Dict[str, gym.spaces.box.Box]) → omegaconf.dictconfig.DictConfig
Infer some config attributes during runtime.
- Parameters
config (ConfigType) – config to update.
env_metadata (EnvMetaDataType) – metadata of the environment.
- Returns
updated config.
- Return type
ConfigType
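The idea of inferring config attributes from environment metadata can be sketched with plain dictionaries. Everything here is an assumption made for illustration: the function name `prepare_config_sketch` and the keys `env`, `obs_shape`, `action_shape`, `observation_shape` are hypothetical, and the real function operates on an omegaconf DictConfig and gym spaces rather than dicts and tuples.

```python
import copy
from typing import Any, Dict


def prepare_config_sketch(
    config: Dict[str, Any], env_metadata: Dict[str, Any]
) -> Dict[str, Any]:
    """Return a copy of config with env-derived attributes filled in."""
    # Work on a copy so the caller's config is left untouched
    # (a choice made for this sketch only).
    updated = copy.deepcopy(config)
    # Hypothetical keys: shapes are only known once the envs are built.
    updated["env"]["obs_shape"] = list(env_metadata["observation_shape"])
    updated["env"]["action_shape"] = list(env_metadata["action_shape"])
    return updated
```

Filling in these values at runtime means they do not have to be kept in sync by hand between the config files and the environments.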
mtrl.experiment.metaworld module
Class to interface with an Experiment
- class mtrl.experiment.metaworld.Experiment(config: omegaconf.dictconfig.DictConfig, experiment_id: str = '0')
Bases: mtrl.experiment.multitask.Experiment
Experiment Class to manage the lifecycle of a multi-task model.
- Parameters
config (ConfigType) –
experiment_id (str, optional) – Defaults to “0”.
- collect_trajectory(vec_env: mtrl.env.vec_env.VecEnv, num_steps: int) → None
Collect some trajectories by unrolling the policy (in train mode), and update the replay buffer.
- Parameters
vec_env (VecEnv) – environment to collect data from.
num_steps (int) – number of steps to collect data for.
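The general pattern of unrolling a policy and filling a replay buffer can be sketched as follows. This is a simplified stand-in, not mtrl's method: `ToyEnv`, `random_policy`, and `collect_trajectory_sketch` are hypothetical names, a single scalar environment replaces the vectorized one, and a Python list replaces the replay buffer.

```python
import random
from typing import Callable, List, Tuple

# (observation, action, reward, next_observation)
Transition = Tuple[float, int, float, float]


class ToyEnv:
    """Tiny 1-D environment used only for this sketch."""

    def __init__(self) -> None:
        self.state = 0.0

    def reset(self) -> float:
        self.state = 0.0
        return self.state

    def step(self, action: int) -> Tuple[float, float, bool]:
        # Action 1 moves right, any other action moves left.
        self.state += 1.0 if action == 1 else -1.0
        reward = -abs(self.state)
        done = abs(self.state) >= 3.0
        return self.state, reward, done


def random_policy(obs: float) -> int:
    """Placeholder policy that ignores the observation."""
    return random.randint(0, 1)


def collect_trajectory_sketch(
    env: ToyEnv,
    policy: Callable[[float], int],
    replay_buffer: List[Transition],
    num_steps: int,
) -> None:
    """Unroll the policy for num_steps and store every transition."""
    obs = env.reset()
    for _ in range(num_steps):
        action = policy(obs)
        next_obs, reward, done = env.step(action)
        replay_buffer.append((obs, action, reward, next_obs))
        # Start a new episode when the current one ends.
        obs = env.reset() if done else next_obs
```

After `collect_trajectory_sketch(ToyEnv(), random_policy, buffer, num_steps=10)`, the list `buffer` holds ten transitions regardless of how many episodes ended along the way.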
- create_eval_modes_to_env_ids()
Map each eval mode to a list of environment indices.
The eval modes are of the form eval_xyz, where xyz specifies the type of evaluation. For example, eval_interpolation means that we are using interpolation environments for evaluation. The eval mode can also be set to just eval.
- Returns
dictionary with the different eval modes as keys and lists of environment indices as values.
- Return type
Dict[str, List[int]]
- evaluate_vec_env_of_tasks(vec_env: mtrl.env.vec_env.VecEnv, step: int, episode: int)
Evaluate the agent’s performance on the different environments, which are wrapped in a single vectorized environment.
Since we are evaluating on multiple tasks, additional metadata is recorded to identify which metric corresponds to which task.
- Parameters
vec_env (VecEnv) – vectorized environment.
step (int) – step for tracking the training of the agent.
episode (int) – episode for tracking the training of the agent.
mtrl.experiment.multitask module
Experiment class manages the lifecycle of a multi-task model.
- class mtrl.experiment.multitask.Experiment(config: omegaconf.dictconfig.DictConfig, experiment_id: str = '0')
Bases: mtrl.experiment.experiment.Experiment
Experiment Class to manage the lifecycle of a multi-task model.
- Parameters
config (ConfigType) –
experiment_id (str, optional) – Defaults to “0”.
- build_envs() → Tuple[Dict[str, mtrl.env.vec_env.VecEnv], Dict[str, gym.spaces.box.Box]]
Build environments and return env-related metadata.
- collect_trajectory(vec_env: mtrl.env.vec_env.VecEnv, num_steps: int) → None
Collect some trajectories by unrolling the policy (in train mode), and update the replay buffer.
- Parameters
vec_env (VecEnv) – environment to collect data from.
num_steps (int) – number of steps to collect data for.
- create_eval_modes_to_env_ids() → Dict[str, List[int]]
Map each eval mode to a list of environment indices.
The eval modes are of the form eval_xyz, where xyz specifies the type of evaluation. For example, eval_interpolation means that we are using interpolation environments for evaluation. The eval mode can also be set to just eval.
- Returns
dictionary with the different eval modes as keys and lists of environment indices as values.
- Return type
Dict[str, List[int]]
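A mapping of this shape can be built as in the sketch below. It is an illustration under assumptions and not the library's logic: the function name `eval_modes_to_env_ids_sketch`, the input format (number of environments per mode), the consecutive index assignment, and the mode name `eval_extrapolation` in the usage note are all hypothetical.

```python
from typing import Dict, List


def eval_modes_to_env_ids_sketch(
    num_envs_per_mode: Dict[str, int]
) -> Dict[str, List[int]]:
    """Assign consecutive environment indices to each eval mode."""
    mapping: Dict[str, List[int]] = {}
    next_index = 0
    for mode, num_envs in num_envs_per_mode.items():
        # Accept either the bare mode "eval" or the form "eval_xyz".
        if mode != "eval" and not mode.startswith("eval_"):
            raise ValueError(f"not an eval mode: {mode}")
        mapping[mode] = list(range(next_index, next_index + num_envs))
        next_index += num_envs
    return mapping
```

For instance, `{"eval_interpolation": 2, "eval_extrapolation": 3}` maps to indices `[0, 1]` and `[2, 3, 4]`, so a metric recorded for environment 3 can be attributed to the second mode.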
mtrl.experiment.utils module
- mtrl.experiment.utils.clear(config: omegaconf.dictconfig.DictConfig) → None
Clear an experiment and delete all its data/metadata/logs given a config.
- Parameters
config (ConfigType) – config of the experiment to be cleared
- mtrl.experiment.utils.get_dirs_to_delete_from_experiment(config: omegaconf.dictconfig.DictConfig) → List[str]
Return a list of dirs that should be deleted when clearing an experiment.
- Parameters
config (ConfigType) – config of the experiment to be cleared
- Returns
List of directories to be deleted
- Return type
List[str]
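The way the two utilities relate (one lists the directories, the other deletes them) can be sketched as below. This is a hedged illustration: the function names ending in `_sketch`, the `demo` helper, and the config keys `log_dir` and `save_dir` are hypothetical, and a plain dict stands in for the DictConfig. Only standard-library calls are used.

```python
import shutil
import tempfile
from pathlib import Path
from typing import Dict, List


def get_dirs_to_delete_sketch(config: Dict[str, str]) -> List[str]:
    """List the directories owned by an experiment (hypothetical keys)."""
    return [config["log_dir"], config["save_dir"]]


def clear_sketch(config: Dict[str, str]) -> None:
    """Delete every directory reported for the experiment."""
    for dir_path in get_dirs_to_delete_sketch(config):
        # ignore_errors=True tolerates directories that were never created.
        shutil.rmtree(dir_path, ignore_errors=True)


def demo() -> bool:
    """Create a throwaway experiment layout, clear it, report success."""
    root = Path(tempfile.mkdtemp())
    config = {"log_dir": str(root / "logs"), "save_dir": str(root / "model")}
    Path(config["log_dir"]).mkdir()
    (Path(config["log_dir"]) / "train.log").write_text("step 0\n")
    clear_sketch(config)
    cleared = not any(
        Path(d).exists() for d in get_dirs_to_delete_sketch(config)
    )
    # Remove the temporary root itself.
    shutil.rmtree(root, ignore_errors=True)
    return cleared
```

Keeping the directory listing in its own function makes it possible to inspect what would be removed before anything is deleted.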