Bibliography (3):

  1. Masked Autoencoders Are Scalable Vision Learners (MAE)

  2. Attention Is All You Need

  3. AudioMAE (code repository): https://github.com/facebookresearch/AudioMAE