“FLEURS: Few-Shot Learning Evaluation of Universal Representations of Speech”, Alexis Conneau, Min Ma, Simran Khanuja, Yu Zhang, Vera Axelrod, Siddharth Dalmia, Jason Riesa, Clara Rivera, Ankur Bapna, 2022-05-25:

We introduce FLEURS, the Few-shot Learning Evaluation of Universal Representations of Speech benchmark. FLEURS is an n-way parallel speech dataset covering 102 languages, built on top of the FLoRes-101 machine translation benchmark, with approximately 12 hours of speech supervision per language. FLEURS can be used for a variety of speech tasks, including Automatic Speech Recognition (ASR), Speech Language Identification (Speech LangID), Translation, and Retrieval.
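The "n-way parallel" property means the same underlying FLoRes-101 sentence is recorded in every language, so aligned (audio, transcript) pairs can be drawn across any language pair. A minimal sketch of that layout, using invented sentence IDs, language codes, paths, and text purely for illustration:

```python
# Hypothetical illustration of FLEURS' n-way parallel layout:
# one sentence ID maps to aligned (audio, transcript) entries
# across languages. All values below are invented examples.
fleurs_sample = {
    "sentence_001": {
        "en_us": {"audio": "en_us/001.wav", "text": "The cat sat on the mat."},
        "fr_fr": {"audio": "fr_fr/001.wav", "text": "Le chat est assis sur le tapis."},
        "sw_ke": {"audio": "sw_ke/001.wav", "text": "Paka alikaa juu ya mkeka."},
    },
}

def parallel_pair(dataset, sentence_id, src_lang, tgt_lang):
    """Return an aligned (source, target) entry, as one might for
    speech translation or cross-lingual retrieval."""
    entry = dataset[sentence_id]
    return entry[src_lang], entry[tgt_lang]

src, tgt = parallel_pair(fleurs_sample, "sentence_001", "en_us", "fr_fr")
```

Because every language shares the same sentence IDs, the same lookup supports ASR (audio plus same-language text), translation (audio plus other-language text), and retrieval (matching audio to parallel text).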

In this paper, we provide baselines for these tasks based on multilingual pre-trained models such as mSLAM.

The goal of FLEURS is to enable speech technology in more languages and catalyze research in low-resource speech understanding.