Is MLP-Mixer a CNN in Disguise? In this blog post, we look at the MLP-Mixer architecture in detail and examine why it is not considered convolution-free.
Vision Transformer: An Image is Worth 16×16 Words: Transformers for Image Recognition at Scale
MLP-Mixer: An all-MLP Architecture for Vision
Attention Is All You Need