Librispeech: An ASR Corpus Based on Public Domain Audio Books (https://danielpovey.com/files/2015_icassp_librispeech.pdf)
Libri-Light: A Benchmark for ASR with Limited or No Supervision
Conformer: Convolution-augmented Transformer for Speech Recognition
wav2vec 2.0: A Framework for Self-Supervised Learning of Speech Representations