Bibliography (6):

  1. Scalable and Efficient MoE Training for Multitask Multilingual Models

  2. GODEL: Large-Scale Pre-Training for Goal-Directed Dialog

  3. Project Z-Code (Microsoft Research): https://www.microsoft.com/en-us/research/project/project-zcode/

  4. Primer: Searching for Efficient Transformers for Language Modeling (https://arxiv.org/abs/2109.08668)

  5. Don’t Give Me the Details, Just the Summary! Topic-Aware Convolutional Neural Networks for Extreme Summarization

  6. https://arxiv.org/abs/2005.02365