Bibliography (9):

  1. GPT-3: Language Models are Few-Shot Learners

  2. mT5: A massively multilingual pre-trained text-to-text transformer

  3. XGLM: Few-shot Learning with Multilingual Language Models

  4. Unsupervised Neural Machine Translation with Generative Language Models Only

  5. What Language Model Architecture and Pretraining Objective Work Best for Zero-Shot Generalization?

  6. UL2: Unifying Language Learning Paradigms

  7. AlexaTM 20B: Few-Shot Learning Using a Large-Scale Multilingual Seq2Seq Model

  8. Bidirectional Language Models Are Also Few-shot Learners (https://arxiv.org/pdf/2209.14500.pdf#page=4)

  9. Wikipedia Bibliography:

    1. Autoregressive model