Supported Algorithms

The following algorithms are supported:

  • Multi-task SAC

  • Multi-task SAC with Task Encoder

  • Multi-headed SAC

  • Distral from Distral: Robust Multitask Reinforcement Learning [TBC+17]

  • PCGrad from Gradient Surgery for Multi-Task Learning [YKG+20] (a sketch follows this list)

  • GradNorm from GradNorm: Gradient Normalization for Adaptive Loss Balancing in Deep Multitask Networks [CBLR18]

  • DeepMDP from DeepMDP: Learning Continuous Latent Space Models for Representation Learning [GKB+19]

  • HiPBMDP from Multi-Task Reinforcement Learning as a Hidden-Parameter Block MDP [ZSKP20]

  • Soft Modularization from Multi-Task Reinforcement Learning with Soft Modularization [YXWW20]

  • CARE from Multi-Task Reinforcement Learning with Context-based Representations [SZP21]
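
To give a concrete flavor of one of these methods, the sketch below implements PCGrad's gradient-surgery step [YKG+20] in plain PyTorch: whenever two task gradients conflict (negative dot product), one is projected onto the normal plane of the other before the gradients are summed. This is a minimal illustration of the technique, not this repository's implementation; the pcgrad function and its flattened-gradient argument layout are assumptions made for the example:

    import torch

    def pcgrad(task_grads):
        """Combine per-task gradients using PCGrad-style projection.

        task_grads: list of flattened 1-D gradient tensors, one per task.
        Returns a single combined gradient of the same shape.
        """
        combined = []
        for i, g_i in enumerate(task_grads):
            g = g_i.clone()
            # Visit the other tasks' gradients in a random order.
            for j in torch.randperm(len(task_grads)).tolist():
                if j == i:
                    continue
                g_j = task_grads[j]
                dot = torch.dot(g, g_j)
                # Conflict: negative dot product. Project g onto the
                # normal plane of g_j to remove the conflicting component.
                if dot < 0:
                    g = g - (dot / g_j.norm() ** 2) * g_j
            combined.append(g)
        # Sum the surgically-projected gradients across tasks.
        return torch.stack(combined).sum(dim=0)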

Along with the standard SAC components (actor, critic, etc.), the following components are supported and can be used with the base algorithms in a plug-and-play fashion:

  • Task Encoder

  • State Encoders

    • Attention weighted Mixture of Encoders

    • Gated Mixture of Encoders

    • Ensemble of Encoders

    • FiLM Encoders [PSDV+18] (a sketch follows this list)

  • Multi-headed actor, critic, and value functions

  • Modularized actor, critic, and value functions based on [YXWW20]
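
As an example of how a task representation can condition a state encoder, the sketch below shows a FiLM-style encoder [PSDV+18] in PyTorch: a task embedding predicts a per-feature scale and shift that modulate the state features. The class name, layer sizes, and argument names are hypothetical, chosen only for illustration:

    import torch
    import torch.nn as nn

    class FiLMEncoder(nn.Module):
        """State encoder whose features are modulated by a task embedding."""

        def __init__(self, state_dim, task_dim, hidden_dim):
            super().__init__()
            self.trunk = nn.Linear(state_dim, hidden_dim)
            # The task embedding predicts a per-feature scale (gamma)
            # and shift (beta) for the state features.
            self.film = nn.Linear(task_dim, 2 * hidden_dim)

        def forward(self, state, task_embedding):
            h = torch.relu(self.trunk(state))
            gamma, beta = self.film(task_embedding).chunk(2, dim=-1)
            return gamma * h + beta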

For example, we can train Multi-task SAC with FiLM encoders, or Multi-headed SAC with a gated mixture of encoders, with or without a task encoder.
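
In the same plug-and-play spirit, a gated mixture of encoders can be sketched as follows: the task embedding produces softmax gates over a set of state encoders, and their weighted combination becomes the representation fed to the actor and critic. As before, the names and shapes are illustrative assumptions, not this repository's API:

    import torch
    import torch.nn as nn

    class GatedMixtureOfEncoders(nn.Module):
        """Mix several state encoders with gates computed from a task embedding."""

        def __init__(self, encoders, task_dim):
            super().__init__()
            self.encoders = nn.ModuleList(encoders)
            self.gate = nn.Linear(task_dim, len(encoders))

        def forward(self, state, task_embedding):
            # One feature vector per encoder: (num_encoders, batch, feature_dim).
            features = torch.stack([enc(state) for enc in self.encoders], dim=0)
            # Softmax gate over the encoders, computed from the task embedding.
            weights = torch.softmax(self.gate(task_embedding), dim=-1)
            weights = weights.permute(1, 0).unsqueeze(-1)  # -> (num_encoders, batch, 1)
            return (weights * features).sum(dim=0)

Swapping the softmax gate for attention weights, or averaging the encoder outputs uniformly, gives rough analogues of the attention-weighted mixture and ensemble variants listed above.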

Refer to the `tutorial <>`_ for more details.

References

CBLR18

Zhao Chen, Vijay Badrinarayanan, Chen-Yu Lee, and Andrew Rabinovich. Gradnorm: gradient normalization for adaptive loss balancing in deep multitask networks. In International Conference on Machine Learning, 794–803. PMLR, 2018.

GKB+19

Carles Gelada, Saurabh Kumar, Jacob Buckman, Ofir Nachum, and Marc G Bellemare. Deepmdp: learning continuous latent space models for representation learning. In International Conference on Machine Learning, 2170–2179. PMLR, 2019.

PSDV+18

Ethan Perez, Florian Strub, Harm De Vries, Vincent Dumoulin, and Aaron Courville. Film: visual reasoning with a general conditioning layer. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 32. 2018.
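
SZP21

Shagun Sodhani, Amy Zhang, and Joelle Pineau. Multi-task reinforcement learning with context-based representations. In International Conference on Machine Learning. PMLR, 2021.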

TBC+17

Yee Whye Teh, Victor Bapst, Wojciech Marian Czarnecki, John Quan, James Kirkpatrick, Raia Hadsell, Nicolas Heess, and Razvan Pascanu. Distral: robust multitask reinforcement learning. arXiv preprint arXiv:1707.04175, 2017.

YXWW20

Ruihan Yang, Huazhe Xu, Yi Wu, and Xiaolong Wang. Multi-task reinforcement learning with soft modularization. arXiv preprint arXiv:2003.13661, 2020.

YKG+20

Tianhe Yu, Saurabh Kumar, Abhishek Gupta, Sergey Levine, Karol Hausman, and Chelsea Finn. Gradient surgery for multi-task learning. arXiv preprint arXiv:2001.06782, 2020.

YQH+20

Tianhe Yu, Deirdre Quillen, Zhanpeng He, Ryan Julian, Karol Hausman, Chelsea Finn, and Sergey Levine. Meta-world: a benchmark and evaluation for multi-task and meta reinforcement learning. In Conference on Robot Learning, 1094–1100. PMLR, 2020.

ZSKP20

Amy Zhang, Shagun Sodhani, Khimya Khetarpal, and Joelle Pineau. Multi-task reinforcement learning as a hidden-parameter block mdp. arXiv preprint arXiv:2007.07206, 2020.