Supported Algorithms
The following algorithms are supported:
- Multi-task SAC
- Multi-task SAC with Task Encoder
- Multi-headed SAC
- Distral from Distral: Robust multitask reinforcement learning [TBC+17]
- PCGrad from Gradient surgery for multi-task learning [YKG+20]
- GradNorm from GradNorm: Gradient normalization for adaptive loss balancing in deep multitask networks [CBLR18]
- DeepMDP from DeepMDP: Learning Continuous Latent Space Models for Representation Learning [GKB+19]
- HiPBMDP from Multi-Task Reinforcement Learning as a Hidden-Parameter Block MDP [ZSKP20]
- Soft Modularization from Multi-Task Reinforcement Learning with Soft Modularization [YXWW20]
- CARE
Along with the standard SAC components (actor, critic, etc.), the following components are supported and can be used with the base algorithms in a plug-and-play fashion:
- Task Encoder
- State Encoders
- Attention-weighted Mixture of Encoders
- Gated Mixture of Encoders
- Ensemble of Encoders
- FiLM Encoders [PSDV+18]
- Multi-headed actor, critic, and value functions
- Modularized actor, critic, and value functions based on [YXWW20]
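To make the encoder components above concrete, here is a rough sketch of the FiLM idea from [PSDV+18]. This is not the library's implementation; all shapes, names, and the linear parameterization are illustrative assumptions. A conditioning vector (here, a task embedding) produces a per-feature scale (gamma) and shift (beta) that modulate the state features.

```python
import numpy as np

# Illustrative sketch only, not the library's API.
rng = np.random.default_rng(0)

state_dim, task_dim = 8, 4
# Hypothetical projection matrices mapping the task embedding
# to a per-feature scale (gamma) and shift (beta).
W_gamma = rng.standard_normal((task_dim, state_dim)) * 0.1
W_beta = rng.standard_normal((task_dim, state_dim)) * 0.1

def film_encode(state_features, task_embedding):
    gamma = task_embedding @ W_gamma  # feature-wise scale
    beta = task_embedding @ W_beta    # feature-wise shift
    return gamma * state_features + beta

features = film_encode(np.ones(state_dim), np.ones(task_dim))
```

Because the modulation is conditioned on the task embedding, the same state encoder can produce task-specific features without a separate network per task.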
For example, we can train a Multi-task SAC with FiLM encoders, or a Multi-headed SAC with a gated mixture of encoders, with or without a task encoder.
Refer to the `tutorial <>`_ for more details.
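As a minimal sketch of how a gated mixture of encoders composes with a task encoder (this is not the library's actual API; the class, its parameters, and the linear encoders are illustrative assumptions): a task embedding drives a gating network that produces soft weights over k independent state encoders, and the output is the gate-weighted combination of their features.

```python
import numpy as np

def softmax(x):
    # numerically stable softmax over a 1-D array
    e = np.exp(x - x.max())
    return e / e.sum()

class GatedMixtureOfEncoders:
    """Illustrative sketch only, not the library's implementation."""

    def __init__(self, num_encoders, state_dim, task_dim, out_dim, seed=0):
        rng = np.random.default_rng(seed)
        # one (hypothetical) linear encoder per expert
        self.encoders = [
            rng.standard_normal((state_dim, out_dim)) * 0.1
            for _ in range(num_encoders)
        ]
        # gating network: task embedding -> one logit per encoder
        self.gate = rng.standard_normal((task_dim, num_encoders)) * 0.1

    def __call__(self, state, task_embedding):
        gates = softmax(task_embedding @ self.gate)               # (k,)
        feats = np.stack([state @ W for W in self.encoders])      # (k, out_dim)
        return gates @ feats                                      # (out_dim,)

mix = GatedMixtureOfEncoders(num_encoders=3, state_dim=4, task_dim=2, out_dim=8)
z = mix(np.ones(4), np.ones(2))
```

Swapping the gating for attention weights, or dropping the task embedding entirely, yields the attention-weighted mixture and the plain ensemble variants listed above.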
References
- CBLR18
Zhao Chen, Vijay Badrinarayanan, Chen-Yu Lee, and Andrew Rabinovich. Gradnorm: gradient normalization for adaptive loss balancing in deep multitask networks. In International Conference on Machine Learning, 794–803. PMLR, 2018.
- GKB+19
Carles Gelada, Saurabh Kumar, Jacob Buckman, Ofir Nachum, and Marc G Bellemare. Deepmdp: learning continuous latent space models for representation learning. In International Conference on Machine Learning, 2170–2179. PMLR, 2019.
- PSDV+18
Ethan Perez, Florian Strub, Harm De Vries, Vincent Dumoulin, and Aaron Courville. Film: visual reasoning with a general conditioning layer. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 32. 2018.
- TBC+17
Yee Whye Teh, Victor Bapst, Wojciech Marian Czarnecki, John Quan, James Kirkpatrick, Raia Hadsell, Nicolas Heess, and Razvan Pascanu. Distral: robust multitask reinforcement learning. arXiv preprint arXiv:1707.04175, 2017.
- YXWW20
Ruihan Yang, Huazhe Xu, Yi Wu, and Xiaolong Wang. Multi-task reinforcement learning with soft modularization. arXiv preprint arXiv:2003.13661, 2020.
- YKG+20
Tianhe Yu, Saurabh Kumar, Abhishek Gupta, Sergey Levine, Karol Hausman, and Chelsea Finn. Gradient surgery for multi-task learning. arXiv preprint arXiv:2001.06782, 2020.
- YQH+20
Tianhe Yu, Deirdre Quillen, Zhanpeng He, Ryan Julian, Karol Hausman, Chelsea Finn, and Sergey Levine. Meta-world: a benchmark and evaluation for multi-task and meta reinforcement learning. In Conference on Robot Learning, 1094–1100. PMLR, 2020.
- ZSKP20
Amy Zhang, Shagun Sodhani, Khimya Khetarpal, and Joelle Pineau. Multi-task reinforcement learning as a hidden-parameter block mdp. arXiv preprint arXiv:2007.07206, 2020.