Bibliography (5):

  1. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

  2. RoBERTa: A Robustly Optimized BERT Pretraining Approach

  3. OPT: Open Pre-trained Transformer Language Models

  4. GPT-3: Language Models are Few-Shot Learners

  5. FOLIO: Natural Language Reasoning with First-Order Logic, https://github.com/Yale-LILY/FOLIO