Supported Algorithms

The following algorithms are supported:

  • Multi-task SAC

  • Multi-task SAC with Task Encoder

  • Multi-headed SAC

  • Distral from Distral: Robust Multitask Reinforcement Learning [TBC+17]

  • PCGrad from Gradient Surgery for Multi-Task Learning [YKG+20] (a sketch follows this list)

  • GradNorm from GradNorm: Gradient Normalization for Adaptive Loss Balancing in Deep Multitask Networks [CBLR18]

  • DeepMDP from DeepMDP: Learning Continuous Latent Space Models for Representation Learning [GKB+19]

  • HiPBMDP from Multi-Task Reinforcement Learning as a Hidden-Parameter Block MDP [ZSKP20]

  • Soft Modularization from Multi-Task Reinforcement Learning with Soft Modularization [YXWW20]

  • CARE from Multi-Task Reinforcement Learning with Context-based Representations [SZP21]
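
To give a concrete flavor of one of these methods, the sketch below implements PCGrad's gradient-surgery step [YKG+20] in plain PyTorch: whenever two task gradients conflict (negative dot product), one is projected onto the normal plane of the other before the gradients are summed. This is a minimal illustration of the technique, not this repository's implementation; the pcgrad function and its flattened-gradient argument layout are assumptions made for the example:

    import torch

    def pcgrad(task_grads):
        """Combine per-task gradients using PCGrad-style projection.

        task_grads: list of flattened 1-D gradient tensors, one per task.
        Returns a single combined gradient of the same shape.
        """
        combined = []
        for i, g_i in enumerate(task_grads):
            g = g_i.clone()
            # Visit the other tasks' gradients in a random order.
            for j in torch.randperm(len(task_grads)).tolist():
                if j == i:
                    continue
                g_j = task_grads[j]
                dot = torch.dot(g, g_j)
                # Conflict: negative dot product. Project g onto the
                # normal plane of g_j to remove the conflicting component.
                if dot < 0:
                    g = g - (dot / g_j.norm() ** 2) * g_j
            combined.append(g)
        # Sum the surgically-projected gradients across tasks.
        return torch.stack(combined).sum(dim=0)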

Along with the standard SAC components (actor, critic, etc.), the following components are supported and can be used with the base algorithms in a plug-and-play fashion:

  • Task Encoder

  • State Encoders

    • Attention weighted Mixture of Encoders

    • Gated Mixture of Encoders

    • Ensemble of Encoders

    • FiLM Encoders [PSDV+18] (a sketch follows this list)

  • Multi-headed actor, critic, and value functions

  • Modularized actor, critic, and value functions based on [YXWW20]
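
As an example of how a task representation can condition a state encoder, the sketch below shows a FiLM-style encoder [PSDV+18] in PyTorch: a task embedding predicts a per-feature scale and shift that modulate the state features. The class name, layer sizes, and argument names are hypothetical, chosen only for illustration:

    import torch
    import torch.nn as nn

    class FiLMEncoder(nn.Module):
        """State encoder whose features are modulated by a task embedding."""

        def __init__(self, state_dim, task_dim, hidden_dim):
            super().__init__()
            self.trunk = nn.Linear(state_dim, hidden_dim)
            # The task embedding predicts a per-feature scale (gamma)
            # and shift (beta) for the state features.
            self.film = nn.Linear(task_dim, 2 * hidden_dim)

        def forward(self, state, task_embedding):
            h = torch.relu(self.trunk(state))
            gamma, beta = self.film(task_embedding).chunk(2, dim=-1)
            return gamma * h + beta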

For example, we can train Multi-task SAC with FiLM encoders, or Multi-headed SAC with a gated mixture of encoders, with or without a task encoder.
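
In the same plug-and-play spirit, a gated mixture of encoders can be sketched as follows: the task embedding produces softmax gates over a set of state encoders, and their weighted combination becomes the representation fed to the actor and critic. As before, the names and shapes are illustrative assumptions, not this repository's API:

    import torch
    import torch.nn as nn

    class GatedMixtureOfEncoders(nn.Module):
        """Mix several state encoders with gates computed from a task embedding."""

        def __init__(self, encoders, task_dim):
            super().__init__()
            self.encoders = nn.ModuleList(encoders)
            self.gate = nn.Linear(task_dim, len(encoders))

        def forward(self, state, task_embedding):
            # One feature vector per encoder: (num_encoders, batch, feature_dim).
            features = torch.stack([enc(state) for enc in self.encoders], dim=0)
            # Softmax gate over the encoders, computed from the task embedding.
            weights = torch.softmax(self.gate(task_embedding), dim=-1)
            weights = weights.permute(1, 0).unsqueeze(-1)  # -> (num_encoders, batch, 1)
            return (weights * features).sum(dim=0)

Swapping the softmax gate for attention weights, or averaging the encoder outputs uniformly, gives rough analogues of the attention-weighted mixture and ensemble variants listed above.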

Refer to the `tutorial <>`_ for more details.

References

CBLR18

Zhao Chen, Vijay Badrinarayanan, Chen-Yu Lee, and Andrew Rabinovich. Gradnorm: gradient normalization for adaptive loss balancing in deep multitask networks. In International Conference on Machine Learning, 794–803. PMLR, 2018.

GKB+19

Carles Gelada, Saurabh Kumar, Jacob Buckman, Ofir Nachum, and Marc G Bellemare. Deepmdp: learning continuous latent space models for representation learning. In International Conference on Machine Learning, 2170–2179. PMLR, 2019.

PSDV+18

Ethan Perez, Florian Strub, Harm De Vries, Vincent Dumoulin, and Aaron Courville. Film: visual reasoning with a general conditioning layer. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 32. 2018.
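
SZP21

Shagun Sodhani, Amy Zhang, and Joelle Pineau. Multi-task reinforcement learning with context-based representations. In International Conference on Machine Learning. PMLR, 2021.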

TBC+17

Yee Whye Teh, Victor Bapst, Wojciech Marian Czarnecki, John Quan, James Kirkpatrick, Raia Hadsell, Nicolas Heess, and Razvan Pascanu. Distral: robust multitask reinforcement learning. arXiv preprint arXiv:1707.04175, 2017.

YXWW20

Ruihan Yang, Huazhe Xu, Yi Wu, and Xiaolong Wang. Multi-task reinforcement learning with soft modularization. arXiv preprint arXiv:2003.13661, 2020.

YKG+20

Tianhe Yu, Saurabh Kumar, Abhishek Gupta, Sergey Levine, Karol Hausman, and Chelsea Finn. Gradient surgery for multi-task learning. arXiv preprint arXiv:2001.06782, 2020.

YQH+20

Tianhe Yu, Deirdre Quillen, Zhanpeng He, Ryan Julian, Karol Hausman, Chelsea Finn, and Sergey Levine. Meta-world: a benchmark and evaluation for multi-task and meta reinforcement learning. In Conference on Robot Learning, 1094–1100. PMLR, 2020.

ZSKP20

Amy Zhang, Shagun Sodhani, Khimya Khetarpal, and Joelle Pineau. Multi-task reinforcement learning as a hidden-parameter block mdp. arXiv preprint arXiv:2007.07206, 2020.