- Beyond English-Centric Multilingual Machine Translation
- https://github.com/facebookresearch/fairseq/tree/main/examples/m2m_100
- CCAligned: A Massive Collection of Cross-Lingual Web-Document Pairs
- CCMatrix: Mining Billions of High-Quality Parallel Sentences on the Web
- https://ai.meta.com/blog/laser-multilingual-sentence-embeddings/
- Wikipedia Bibliography:
- BLEU