Bibliography (3):

  1. GPT-3: Language Models are Few-Shot Learners

  2. FLAN: Finetuned Language Models Are Zero-Shot Learners

  3. mT5: A Massively Multilingual Pre-trained Text-to-Text Transformer