mtrl package
Subpackages
- mtrl.agent package
  - Subpackages
    - mtrl.agent.components package
      - Submodules
        - mtrl.agent.components.actor module
        - mtrl.agent.components.base module
        - mtrl.agent.components.critic module
        - mtrl.agent.components.decoder module
        - mtrl.agent.components.encoder module
        - mtrl.agent.components.hipbmdp_theta module
        - mtrl.agent.components.moe_layer module
        - mtrl.agent.components.reward_decoder module
        - mtrl.agent.components.scripted_soft_modularization module
        - mtrl.agent.components.soft_modularization module
        - mtrl.agent.components.task_encoder module
        - mtrl.agent.components.transition_model module
      - Module contents
    - mtrl.agent.ds package
  - Submodules
    - mtrl.agent.abstract module
    - mtrl.agent.deepmdp module
    - mtrl.agent.distral module
    - mtrl.agent.grad_manipulation module
    - mtrl.agent.gradnorm module
    - mtrl.agent.hipbmdp module
    - mtrl.agent.pcgrad module
    - mtrl.agent.sac module
    - mtrl.agent.sac_ae module
    - mtrl.agent.utils module
    - mtrl.agent.wrapper module
  - Module contents
- mtrl.app package
- mtrl.env package
- mtrl.experiment package
- mtrl.utils package
Submodules
mtrl.logger module
class mtrl.logger.AverageMeter
    Bases: mtrl.logger.Meter
class mtrl.logger.CurrentMeter
    Bases: mtrl.logger.Meter
mtrl.replay_buffer module
class mtrl.replay_buffer.ReplayBuffer(env_obs_shape, task_obs_shape, action_shape, capacity, batch_size, device)
    Bases: object
    Buffer to store environment transitions.
    sample(index=None) → mtrl.replay_buffer.ReplayBufferSample
    sample_an_index(index, total_number_of_environments) → mtrl.replay_buffer.ReplayBufferSample
        Return env_observations for only the given index.
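The following is a minimal sketch of how a fixed-capacity transition buffer of this kind might behave. It is not the mtrl implementation: `SketchReplayBuffer`, `SketchSample`, and the `add` method are hypothetical names introduced for illustration, plain Python lists stand in for `torch.Tensor`, and the overwrite-oldest policy and the reading of `index` as a list of buffer positions are assumptions.

```python
import random
from typing import List, NamedTuple, Optional


class SketchSample(NamedTuple):
    """Hypothetical stand-in for a sampled batch (plain lists, not tensors)."""
    env_obs: List[List[float]]
    action: List[List[float]]
    reward: List[float]
    next_env_obs: List[List[float]]
    not_done: List[float]
    task_obs: List[int]
    buffer_index: List[int]


class SketchReplayBuffer:
    """Fixed-capacity circular buffer; illustrative only, not mtrl's code."""

    def __init__(self, capacity: int, batch_size: int, seed: int = 0) -> None:
        self.capacity = capacity
        self.batch_size = batch_size
        self._rows: List[tuple] = []
        self._next = 0  # slot that the next add() writes to
        self._rng = random.Random(seed)

    def __len__(self) -> int:
        return len(self._rows)

    def add(self, env_obs, action, reward, next_env_obs, done, task_obs) -> None:
        """Store one transition; overwrite the oldest once full (assumed policy)."""
        # Store the "not done" mask (1.0 while the episode continues).
        row = (env_obs, action, reward, next_env_obs, 0.0 if done else 1.0, task_obs)
        if len(self._rows) < self.capacity:
            self._rows.append(row)
        else:
            self._rows[self._next] = row
        self._next = (self._next + 1) % self.capacity

    def sample(self, index: Optional[List[int]] = None) -> SketchSample:
        """Return the rows at `index`, or a uniform random batch if index is None."""
        if not self._rows:
            raise ValueError("cannot sample from an empty buffer")
        if index is None:
            index = [self._rng.randrange(len(self._rows)) for _ in range(self.batch_size)]
        rows = [self._rows[i] for i in index]
        columns = [list(col) for col in zip(*rows)]  # rows -> per-field columns
        return SketchSample(*columns, buffer_index=list(index))


# Usage: five transitions into a buffer of capacity 3, so the two oldest are overwritten.
buffer = SketchReplayBuffer(capacity=3, batch_size=2)
for step in range(5):
    buffer.add([float(step)], [0.0], 1.0, [float(step + 1)], done=(step == 4), task_obs=0)
batch = buffer.sample(index=[0, 1])
random_batch = buffer.sample()
```

In this sketch, slots 0 and 1 end up holding the transitions from steps 3 and 4, and `buffer_index` records which slots a batch came from so that a caller can trace samples back to the buffer.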
class mtrl.replay_buffer.ReplayBufferSample(env_obs: torch.Tensor, action: torch.Tensor, reward: torch.Tensor, next_env_obs: torch.Tensor, not_done: torch.Tensor, task_obs: torch.Tensor, buffer_index: torch.Tensor)
    Bases: object
    action
    buffer_index
    env_obs
    next_env_obs
    not_done
    reward
    task_obs
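As a hedged illustration of how the fields of such a sample are commonly consumed, the sketch below computes one-step bootstrapped targets, using `not_done` as a mask that removes the bootstrap term at episode ends. The `SketchSample` dataclass, the `td_targets` function, and the `discount` and `next_values` parameters are hypothetical and not part of mtrl. Plain lists replace `torch.Tensor`, and only the fields the computation needs are included.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class SketchSample:
    """Hypothetical subset of the sample fields, with lists instead of tensors."""
    reward: List[float]
    not_done: List[float]  # 1.0 while the episode continues, 0.0 at termination
    buffer_index: List[int]


def td_targets(sample: SketchSample, next_values: List[float], discount: float) -> List[float]:
    """Compute reward + discount * not_done * next_value for each transition."""
    return [
        r + discount * nd * v
        for r, nd, v in zip(sample.reward, sample.not_done, next_values)
    ]


sample = SketchSample(reward=[1.0, 2.0], not_done=[1.0, 0.0], buffer_index=[7, 8])
targets = td_targets(sample, next_values=[10.0, 10.0], discount=0.5)
print(targets)  # [6.0, 2.0]
```

The second transition ends its episode, so its target is the reward alone. Storing `not_done` rather than `done` lets this mask be applied with a single multiplication.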