Bibliography (3):
MAE: Masked Autoencoders Are Scalable Vision Learners
Attention Is All You Need
https://github.com/facebookresearch/AudioMAE