mtrl.experiment package¶

Submodules¶

mtrl.experiment.dmcontrol module¶

Class to interface with an Experiment

class mtrl.experiment.dmcontrol.Experiment(config: omegaconf.dictconfig.DictConfig, experiment_id: str = '0')[source]¶

Bases: mtrl.experiment.multitask.Experiment

Experiment Class to manage the lifecycle of a multi-task model.

Parameters

config (ConfigType) –
experiment_id (str, optional) – Defaults to “0”.

evaluate_vec_env_of_tasks(vec_env: mtrl.env.vec_env.VecEnv, step: int, episode: int)[source]¶

Evaluate the agent’s performance on the different environments, vectorized as a single instance of vectorized environment.

Since we are evaluating on multiple tasks, we record additional metadata to identify which metric corresponds to which task.

Parameters

vec_env (VecEnv) – vectorized environment.
step (int) – step for tracking the training of the agent.
episode (int) – episode for tracking the training of the agent.
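The per-task bookkeeping described above can be illustrated with a small sketch. It does not reproduce the mtrl implementation: the helper name `mean_reward_per_task` and its arguments are hypothetical, and it assumes that each entry of the vectorized environment can be labelled with the name of the task it runs.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def mean_reward_per_task(
    rewards: Sequence[float], task_names: Sequence[str]
) -> Dict[str, float]:
    """Average rewards per task.

    Hypothetical helper: rewards[i] is assumed to come from the
    sub-environment that runs task_names[i].
    """
    if len(rewards) != len(task_names):
        raise ValueError("rewards and task_names must have the same length")
    grouped: Dict[str, List[float]] = defaultdict(list)
    for reward, task in zip(rewards, task_names):
        # The task name is the metadata that ties a metric to its task.
        grouped[task].append(reward)
    return {task: sum(values) / len(values) for task, values in grouped.items()}
```

For example, `mean_reward_per_task([1.0, 3.0, 10.0], ["reach", "reach", "push"])` groups the first two rewards under `"reach"` and the last one under `"push"`.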

get_action_when_evaluating_vec_env_of_tasks(multitask_obs: Dict[str, Union[numpy.ndarray, str, int, float]], modes: List[str]) → torch.Tensor[source]¶

mtrl.experiment.experiment module¶

Experiment class manages the lifecycle of a model.

class mtrl.experiment.experiment.Experiment(config: omegaconf.dictconfig.DictConfig, experiment_id: str = '0')[source]¶

Bases: mtrl.utils.checkpointable.Checkpointable

Experiment Class to manage the lifecycle of a model.

Parameters

config (ConfigType) –
experiment_id (str, optional) – Defaults to “0”.

build_envs() → Tuple[Dict[str, mtrl.env.vec_env.VecEnv], Dict[str, gym.spaces.box.Box]][source]¶

Subclasses should implement this method to build the environments.

Raises: NotImplementedError – this method should be implemented by the subclasses.
Returns: Tuple of environment dictionary and environment metadata.
Return type: Tuple[EnvsDictType, EnvMetaDataType]
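A minimal sketch of the subclassing contract described above, using stand-in classes rather than the real mtrl types. `BaseExperiment`, `ToyExperiment`, and the contents of the returned dictionaries are hypothetical and only illustrate the pattern of a base method that raises NotImplementedError and a subclass that returns the (environments, metadata) tuple.

```python
from typing import Any, Dict, Tuple


class BaseExperiment:
    """Hypothetical stand-in for a base experiment class."""

    def build_envs(self) -> Tuple[Dict[str, Any], Dict[str, Any]]:
        # The base class only defines the contract.
        raise NotImplementedError(
            "build_envs should be implemented by the subclasses."
        )


class ToyExperiment(BaseExperiment):
    """Hypothetical subclass that fulfils the contract."""

    def build_envs(self) -> Tuple[Dict[str, Any], Dict[str, Any]]:
        # Placeholder objects stand in for vectorized environments,
        # and plain tuples stand in for observation/action spaces.
        envs = {"train": object(), "eval": object()}
        metadata = {"observation_space": (4,), "action_space": (2,)}
        return envs, metadata
```

Calling `BaseExperiment().build_envs()` raises NotImplementedError, while `ToyExperiment().build_envs()` returns a tuple of two dictionaries.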

close_envs()[source]¶: Close all the environments.

load(epoch: Optional[int]) → Any[source]¶

Load the object from a checkpoint.

Returns: Any

periodic_save(epoch: int) → None[source]¶

Periodically save the experiment.

This is a utility method built on top of the save method. It performs an extra check of whether the experiment is configured to be saved during the current epoch.

Parameters

epoch (int) – current epoch.
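The relationship between the two save methods can be sketched as follows. This is an illustration under assumptions, not the mtrl code: the `save_frequency` setting, the `should_save` helper, and the `ToyExperiment` class are hypothetical, and the rule "save every N epochs" is only one plausible way to configure periodic saving.

```python
from typing import List


def should_save(epoch: int, save_frequency: int) -> bool:
    """Return True when a checkpoint is due at this epoch.

    `save_frequency` is a hypothetical setting; a non-positive value
    disables periodic saving in this sketch.
    """
    return save_frequency > 0 and epoch % save_frequency == 0


class ToyExperiment:
    """Hypothetical experiment that records which epochs were saved."""

    def __init__(self, save_frequency: int) -> None:
        self.save_frequency = save_frequency
        self.saved_epochs: List[int] = []

    def save(self, epoch: int) -> None:
        # Stand-in for writing a real checkpoint.
        self.saved_epochs.append(epoch)

    def periodic_save(self, epoch: int) -> None:
        # The extra check layered on top of `save`.
        if should_save(epoch, self.save_frequency):
            self.save(epoch)
```

With `save_frequency=2`, calling `periodic_save` for epochs 0 through 4 saves at epochs 0, 2, and 4 only.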

run() → None[source]¶

Run the experiment.

Raises: NotImplementedError – This method should be implemented by the subclasses.

save(epoch: int) → Any[source]¶

Save the object to a checkpoint.

Returns: Any

startup_logs() → None[source]¶: Write some logs at the start of the experiment.

mtrl.experiment.experiment.get_env_metadata(env: gym.vector.async_vector_env.AsyncVectorEnv, max_episode_steps: Optional[int] = None, ordered_task_list: Optional[List[str]] = None) → Dict[str, gym.spaces.box.Box][source]¶: Method to get the metadata from an environment

mtrl.experiment.experiment.prepare_config(config: omegaconf.dictconfig.DictConfig, env_metadata: Dict[str, gym.spaces.box.Box]) → omegaconf.dictconfig.DictConfig[source]¶

Infer some config attributes during runtime.

Parameters

config (ConfigType) – config to update.
env_metadata (EnvMetaDataType) – metadata of the environment.

Returns

updated config.

Return type

ConfigType

mtrl.experiment.metaworld module¶

Class to interface with an Experiment

class mtrl.experiment.metaworld.Experiment(config: omegaconf.dictconfig.DictConfig, experiment_id: str = '0')[source]¶

Bases: mtrl.experiment.multitask.Experiment

Experiment Class

Experiment Class to manage the lifecycle of a multi-task model.

Parameters

config (ConfigType) –
experiment_id (str, optional) – Defaults to “0”.

build_envs()[source]¶: Build environments and return env-related metadata

collect_trajectory(vec_env: mtrl.env.vec_env.VecEnv, num_steps: int) → None[source]¶

Collect some trajectories by unrolling the policy (in train mode), and update the replay buffer.

Parameters

vec_env (VecEnv) – environment to collect data from.
num_steps (int) – number of steps to collect data for.

create_env_id_to_index_map() → Dict[str, int][source]¶

create_eval_modes_to_env_ids()[source]¶

Map each eval mode to a list of environment indices.

The eval modes are of the form eval_xyz, where xyz specifies the type of evaluation. For example, eval_interpolation means that we are using interpolation environments for evaluation. The eval mode can also be set to just eval.

Returns

dictionary with the eval modes as keys and lists of environment indices as values.

Return type

Dict[str, List[int]]

evaluate_vec_env_of_tasks(vec_env: mtrl.env.vec_env.VecEnv, step: int, episode: int)[source]¶

Evaluate the agent’s performance on the different environments, vectorized as a single instance of vectorized environment.

Since we are evaluating on multiple tasks, we record additional metadata to identify which metric corresponds to which task.

Parameters

vec_env (VecEnv) – vectorized environment.
step (int) – step for tracking the training of the agent.
episode (int) – episode for tracking the training of the agent.

mtrl.experiment.multitask module¶

Experiment class manages the lifecycle of a multi-task model.

class mtrl.experiment.multitask.Experiment(config: omegaconf.dictconfig.DictConfig, experiment_id: str = '0')[source]¶

Bases: mtrl.experiment.experiment.Experiment

Experiment Class to manage the lifecycle of a multi-task model.

Parameters

config (ConfigType) –
experiment_id (str, optional) – Defaults to “0”.

build_envs() → Tuple[Dict[str, mtrl.env.vec_env.VecEnv], Dict[str, gym.spaces.box.Box]][source]¶: Build environments and return env-related metadata

collect_trajectory(vec_env: mtrl.env.vec_env.VecEnv, num_steps: int) → None[source]¶

Collect some trajectories by unrolling the policy (in train mode), and update the replay buffer.

Parameters

vec_env (VecEnv) – environment to collect data from.
num_steps (int) – number of steps to collect data for.

create_eval_modes_to_env_ids() → Dict[str, List[int]][source]¶

Map each eval mode to a list of environment indices.

The eval modes are of the form eval_xyz, where xyz specifies the type of evaluation. For example, eval_interpolation means that we are using interpolation environments for evaluation. The eval mode can also be set to just eval.

Returns

dictionary with the eval modes as keys and lists of environment indices as values.

Return type

Dict[str, List[int]]
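The shape of the returned mapping can be illustrated with a short sketch. How mtrl derives the mode of each environment from the config is not shown on this page, so the sketch assumes a hypothetical input list `mode_per_env` in which entry i is the eval mode of the environment at index i; the function name is likewise hypothetical.

```python
from typing import Dict, List


def map_eval_modes_to_env_ids(mode_per_env: List[str]) -> Dict[str, List[int]]:
    """Group environment indices by eval mode.

    Hypothetical helper: mode_per_env[i] is assumed to be the eval mode
    ("eval" or "eval_xyz") of the environment at index i.
    """
    mapping: Dict[str, List[int]] = {}
    for index, mode in enumerate(mode_per_env):
        if mode != "eval" and not mode.startswith("eval_"):
            raise ValueError(f"unexpected eval mode: {mode!r}")
        # Indices are appended in order, so each list stays sorted.
        mapping.setdefault(mode, []).append(index)
    return mapping
```

For the input `["eval", "eval_interpolation", "eval", "eval_interpolation"]`, the environments at indices 0 and 2 are grouped under `"eval"` and those at indices 1 and 3 under `"eval_interpolation"`.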

run()[source]¶: Run the experiment.

mtrl.experiment.utils module¶

mtrl.experiment.utils.clear(config: omegaconf.dictconfig.DictConfig) → None[source]¶

Clear an experiment and delete all its data/metadata/logs given a config

Parameters: config (ConfigType) – config of the experiment to be cleared

mtrl.experiment.utils.get_dirs_to_delete_from_experiment(config: omegaconf.dictconfig.DictConfig) → List[str][source]¶

Return a list of dirs that should be deleted when clearing an experiment.

Parameters: config (ConfigType) – config of the experiment to be cleared
Returns: List of directories to be deleted
Return type: List[str]
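One way such a list of directories might be consumed when clearing an experiment is sketched below. This is not the implementation of `clear`: the `delete_dirs` helper is hypothetical, and skipping paths that do not exist is an assumption made for the sketch.

```python
import shutil
from pathlib import Path
from typing import Iterable, List


def delete_dirs(dirs: Iterable[str]) -> List[str]:
    """Delete each existing directory and return the ones that were removed.

    Hypothetical helper: paths that are not directories are skipped
    rather than treated as errors.
    """
    deleted: List[str] = []
    for directory in dirs:
        if Path(directory).is_dir():
            # Recursively remove the directory and everything inside it.
            shutil.rmtree(directory)
            deleted.append(directory)
    return deleted
```

Because the deletion is recursive and irreversible, it is worth inspecting the list of directories before passing it to a helper like this.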

mtrl.experiment.utils.prepare_and_run(config: omegaconf.dictconfig.DictConfig) → None[source]¶

Prepare an experiment and run the experiment.

Parameters: config (ConfigType) – config of the experiment
