Bibliography (3):

  1. Language Models are Unsupervised Multitask Learners

  2. BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension

  3. BERTese: Learning to Speak to BERT