Bibliography (3):

  1. Machine Learning Scaling

  2. T5: Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer

  3. The Pile: An 800GB Dataset of Diverse Text for Language Modeling