Bibliography (4):

  1. Unsupervised Cross-lingual Representation Learning at Scale

  2. mT5: A massively multilingual pre-trained text-to-text transformer

  3. DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter