Bibliography (3):

  1. MegatronLM: Training Billion+ Parameter Language Models Using GPU Model Parallelism

  2. T5: Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer

  3. Turing-NLG: A 17-billion-parameter language model by Microsoft