mtrl.experiment package

Submodules

mtrl.experiment.dmcontrol module

Class to interface with an Experiment

class mtrl.experiment.dmcontrol.Experiment(config: omegaconf.dictconfig.DictConfig, experiment_id: str = '0')[source]

Bases: mtrl.experiment.multitask.Experiment

Experiment Class to manage the lifecycle of a multi-task model.

Parameters
  • config (ConfigType) – config of the experiment.

  • experiment_id (str, optional) – Defaults to “0”.
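
A minimal construction sketch, assuming mtrl is installed and a fully composed experiment config is available (the config path below is hypothetical; mtrl normally composes the config with Hydra):

  from omegaconf import OmegaConf

  from mtrl.experiment.dmcontrol import Experiment

  # Hypothetical path: load a fully composed experiment config.
  config = OmegaConf.load("config/experiment.yaml")
  experiment = Experiment(config, experiment_id="0")
  experiment.run()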

evaluate_vec_env_of_tasks(vec_env: mtrl.env.vec_env.VecEnv, step: int, episode: int)[source]

Evaluate the agent’s performance on the different environments, combined into a single vectorized environment.

Since we are evaluating on multiple tasks, we track additional metadata to map each metric back to its task (a sketch follows the parameter list).

Parameters
  • vec_env (VecEnv) – vectorized environment.

  • step (int) – step for tracking the training of the agent.

  • episode (int) – episode for tracking the training of the agent.
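
The bookkeeping pattern can be illustrated with gym’s own vector API, used here as a self-contained stand-in for mtrl’s VecEnv (the random policy and the classic four-value step API are assumptions of this sketch):

  import gym
  import numpy as np

  # Three copies of one task stand in for three different tasks.
  vec_env = gym.vector.AsyncVectorEnv(
      [lambda: gym.make("CartPole-v1") for _ in range(3)]
  )
  obs = vec_env.reset()
  episode_reward = np.zeros(vec_env.num_envs)
  finished = np.zeros(vec_env.num_envs, dtype=bool)
  while not finished.all():
      actions = vec_env.action_space.sample()  # random policy stand-in
      obs, rewards, dones, infos = vec_env.step(actions)
      episode_reward += rewards * ~finished    # freeze tasks that already finished
      finished |= dones
  for task_index, r in enumerate(episode_reward):
      # One metric per task index, so each number maps back to its task.
      print(f"eval/episode_reward_env_{task_index}: {r:.1f}")
  vec_env.close()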

get_action_when_evaluating_vec_env_of_tasks(multitask_obs: Dict[str, Union[numpy.ndarray, str, int, float]], modes: List[str]) → torch.Tensor[source]

mtrl.experiment.experiment module

Experiment class manages the lifecycle of a model.

class mtrl.experiment.experiment.Experiment(config: omegaconf.dictconfig.DictConfig, experiment_id: str = '0')[source]

Bases: mtrl.utils.checkpointable.Checkpointable

Experiment Class to manage the lifecycle of a model.

Parameters
  • config (ConfigType) – config of the experiment.

  • experiment_id (str, optional) – Defaults to “0”.

build_envs() → Tuple[Dict[str, mtrl.env.vec_env.VecEnv], Dict[str, gym.spaces.box.Box]][source]

Subclasses should implement this method to build the environments.

Raises

NotImplementedError – this method should be implemented by the subclasses.

Returns

Tuple of environment dictionary and environment metadata.

Return type

Tuple[EnvsDictType, EnvMetaDataType]
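
A sketch of how a subclass might satisfy this contract; the helper functions and dictionary keys below are hypothetical:

  from mtrl.experiment.experiment import Experiment, get_env_metadata

  class MyExperiment(Experiment):
      def build_envs(self):
          # make_train_vec_env / make_eval_vec_env are hypothetical helpers,
          # and the "train"/"eval" keys are an assumption of this sketch.
          envs = {"train": make_train_vec_env(), "eval": make_eval_vec_env()}
          metadata = get_env_metadata(env=envs["train"])
          return envs, metadata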

close_envs()[source]

Close all the environments.

load(epoch: Optional[int]) → Any[source]

Load the object from a checkpoint.

Returns

Any

periodic_save(epoch: int) → None[source]

Periodically save the experiment.

This is a utility method built on top of the save method. It performs an extra check of whether the experiment is configured to be saved during the current epoch (see the sketch below).

Parameters

epoch (int) – current epoch.
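
The check reduces to something like the following sketch; the persist_frequency config key is an assumption, not mtrl’s actual schema:

  def periodic_save(self, epoch: int) -> None:
      # Assumed config key; save only on configured epochs.
      frequency = self.config.experiment.persist_frequency
      if frequency > 0 and epoch % frequency == 0:
          self.save(epoch)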

run() → None[source]

Run the experiment.

Raises

NotImplementedError – This method should be implemented by the subclasses.

save(epoch: int) → Any[source]

Save the object to a checkpoint.

Returns

Any
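
Together with load, this gives the usual checkpoint round trip (treating epoch=None as “load the latest checkpoint” is an assumption here):

  experiment.save(epoch=10)    # write a checkpoint for epoch 10
  experiment.load(epoch=10)    # restore exactly that checkpoint
  experiment.load(epoch=None)  # assumption: restore the latest checkpoint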

startup_logs() → None[source]

Write some logs at the start of the experiment.

mtrl.experiment.experiment.get_env_metadata(env: gym.vector.async_vector_env.AsyncVectorEnv, max_episode_steps: Optional[int] = None, ordered_task_list: Optional[List[str]] = None) → Dict[str, gym.spaces.box.Box][source]

Get the metadata from an environment.
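
A sketch of calling it on a gym AsyncVectorEnv; the environment is a stand-in and the task names are hypothetical:

  import gym

  from mtrl.experiment.experiment import get_env_metadata

  env = gym.vector.AsyncVectorEnv(
      [lambda: gym.make("CartPole-v1") for _ in range(2)]
  )
  # The metadata is expected to describe the observation/action spaces.
  metadata = get_env_metadata(
      env=env,
      max_episode_steps=200,
      ordered_task_list=["task_a", "task_b"],  # hypothetical task names
  )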

mtrl.experiment.experiment.prepare_config(config: omegaconf.dictconfig.DictConfig, env_metadata: Dict[str, gym.spaces.box.Box]) → omegaconf.dictconfig.DictConfig[source]

Infer some config attributes at runtime.

Parameters
  • config (ConfigType) – config to update.

  • env_metadata (EnvMetaDataType) – metadata of the environment.

Returns

updated config.

Return type

ConfigType
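
Continuing the sketch above: metadata is read from the live environments, then prepare_config fills in the runtime-dependent attributes (exactly which attributes, e.g. space shapes, is an assumption):

  from mtrl.experiment.experiment import get_env_metadata, prepare_config

  env_metadata = get_env_metadata(env=env)  # `env` from the sketch above
  config = prepare_config(config=config, env_metadata=env_metadata)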

mtrl.experiment.metaworld module

Class to interface with an Experiment

class mtrl.experiment.metaworld.Experiment(config: omegaconf.dictconfig.DictConfig, experiment_id: str = '0')[source]

Bases: mtrl.experiment.multitask.Experiment

Experiment Class to manage the lifecycle of a multi-task model.

Parameters
  • config (ConfigType) – config of the experiment.

  • experiment_id (str, optional) – Defaults to “0”.

build_envs()[source]

Build environments and return env-related metadata.

collect_trajectory(vec_env: mtrl.env.vec_env.VecEnv, num_steps: int) → None[source]

Collect some trajectories by unrolling the policy (in train mode) and update the replay buffer; a schematic sketch follows the parameter list.

Parameters
  • vec_env (VecEnv) – environment to collect data from.

  • num_steps (int) – number of steps to collect data for.
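
Schematically, one collection step looks like the sketch below; the agent, buffer, and their method names are assumptions, not mtrl’s exact API:

  for _ in range(num_steps):
      # Act in train mode so the policy keeps exploring.
      action = agent.sample_action(multitask_obs, modes=["train"])
      next_obs, reward, done, info = vec_env.step(action)
      # Store the transition, then carry the observation forward.
      replay_buffer.add(multitask_obs, action, reward, next_obs, done)
      multitask_obs = next_obs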

create_env_id_to_index_map() → Dict[str, int][source]

create_eval_modes_to_env_ids()[source]

Map each eval mode to a list of environment indices.

The eval modes are of the form eval_xyz, where xyz specifies the type of evaluation. For example, eval_interpolation means that interpolation environments are used for evaluation. The eval mode can also be set to just eval.

Returns

Dictionary with the different eval modes as keys and lists of environment indices as values.

Return type

Dict[str, List[int]]
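
The resulting mapping has the following shape; the mode names and indices here are invented for illustration:

  eval_modes_to_env_ids = {
      "eval": [0, 1, 2, 3],          # evaluate on every task
      "eval_interpolation": [1, 2],  # tasks reserved for interpolation
  }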

evaluate_vec_env_of_tasks(vec_env: mtrl.env.vec_env.VecEnv, step: int, episode: int)[source]

Evaluate the agent’s performance on the different environments, combined into a single vectorized environment.

Since we are evaluating on multiple tasks, we track additional metadata to map each metric back to its task.

Parameters
  • vec_env (VecEnv) – vectorized environment.

  • step (int) – step for tracking the training of the agent.

  • episode (int) – episode for tracking the training of the agent.

mtrl.experiment.multitask module

Experiment class manages the lifecycle of a multi-task model.

class mtrl.experiment.multitask.Experiment(config: omegaconf.dictconfig.DictConfig, experiment_id: str = '0')[source]

Bases: mtrl.experiment.experiment.Experiment

Experiment Class to manage the lifecycle of a multi-task model.

Parameters
  • config (ConfigType) – config of the experiment.

  • experiment_id (str, optional) – Defaults to “0”.

build_envs() → Tuple[Dict[str, mtrl.env.vec_env.VecEnv], Dict[str, gym.spaces.box.Box]][source]

Build environments and return env-related metadata.

collect_trajectory(vec_env: mtrl.env.vec_env.VecEnv, num_steps: int) → None[source]

Collect some trajectories by unrolling the policy (in train mode) and update the replay buffer.

Parameters
  • vec_env (VecEnv) – environment to collect data from.

  • num_steps (int) – number of steps to collect data for.

create_eval_modes_to_env_ids() → Dict[str, List[int]][source]

Map each eval mode to a list of environment indices.

The eval modes are of the form eval_xyz, where xyz specifies the type of evaluation. For example, eval_interpolation means that interpolation environments are used for evaluation. The eval mode can also be set to just eval.

Returns

Dictionary with the different eval modes as keys and lists of environment indices as values.

Return type

Dict[str, List[int]]

run()[source]

Run the experiment.
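
An illustrative outline of the loop that run drives (not mtrl’s exact control flow; the loop variables and frequencies are assumptions):

  for epoch in range(start_epoch, num_epochs):
      experiment.collect_trajectory(vec_env=train_env, num_steps=steps_per_epoch)
      # ... agent updates from the replay buffer happen here ...
      if epoch % eval_frequency == 0:
          experiment.evaluate_vec_env_of_tasks(
              vec_env=eval_env, step=epoch * steps_per_epoch, episode=epoch
          )
      experiment.periodic_save(epoch=epoch)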

mtrl.experiment.utils module

mtrl.experiment.utils.clear(config: omegaconf.dictconfig.DictConfig) → None[source]

Clear an experiment and delete all its data/metadata/logs, given a config.

Parameters

config (ConfigType) – config of the experiment to be cleared

mtrl.experiment.utils.get_dirs_to_delete_from_experiment(config: omegaconf.dictconfig.DictConfig) → List[str][source]

Return a list of directories that should be deleted when clearing an experiment.

Parameters

config (ConfigType) – config of the experiment to be cleared

Returns

List of directories to be deleted

Return type

List[str]

mtrl.experiment.utils.prepare_and_run(config: omegaconf.dictconfig.DictConfig) → None[source]

Prepare an experiment and run it.

Parameters

config (ConfigType) – config of the experiment
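
A typical entry point, sketched with Hydra (the config path and name are assumptions):

  import hydra
  from omegaconf import DictConfig

  from mtrl.experiment import utils as experiment_utils

  @hydra.main(config_path="config", config_name="config")
  def main(config: DictConfig) -> None:
      experiment_utils.prepare_and_run(config=config)

  if __name__ == "__main__":
      main()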

Module contents