- BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
- RoBERTa: A Robustly Optimized BERT Pretraining Approach
- OPT: Open Pre-trained Transformer Language Models
- GPT-3: Language Models are Few-Shot Learners
- FOLIO: https://github.com/Yale-LILY/FOLIO