- https://nn.labml.ai/transformers/retro/model.html
- GPT-3: Language Models are Few-Shot Learners
- https://uploads-ssl.webflow.com/60fd4503684b466578c0d307/61138924626a6981ee09caf6_jurassic_tech_paper.pdf#ai21
- The Pile: An 800GB Dataset of Diverse Text for Language Modeling
- BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
-