PairConnect: A Compute-Efficient MLP Alternative to Attention
Best Practices for Convolutional Neural Networks Applied to Visual Document Analysis
Extraction de séquences numériques dans des documents manuscrits quelconques [Extraction of Numerical Sequences in Arbitrary Handwritten Documents]
How far can we go without convolution: Improving fully-connected networks
Deep Neural Networks for Large Vocabulary Handwritten Text Recognition
Do Deep Convolutional Nets Really Need to be Deep and Convolutional?
The Deep Bootstrap Framework: Good Online Learners are Good Offline Generalizers
Sharp Models on Dull Hardware: Fast and Accurate Neural Machine Translation Decoding on the CPU
Sussman Attains Enlightenment (face#sussman-attains-enlightenment)
The Shattered Gradients Problem: If resnets are the answer, then what is the question?
NFNet: High-Performance Large-Scale Image Recognition Without Normalization
Fixup Initialization: Residual Learning Without Normalization
Improving Transformer Optimization Through Better Initialization
ZerO Initialization: Initializing Residual Networks with only Zeros and Ones
Understanding the Covariance Structure of Convolutional Filters
The Goldilocks zone: Towards better understanding of neural network loss landscapes
NAIS-Net: Stable Deep Networks from Non-Autonomous Differential Equations
SwitchNet: a neural network model for forward and inverse scattering problems
Finding the Needle in the Haystack with Convolutions: on the benefits of architectural bias
Adapting the Function Approximation Architecture in Online Reinforcement Learning
Data-driven emergence of convolutional structure in neural networks
Noise Transforms Feed-Forward Networks into Sparse Coding Networks
A Battle of Network Structures: An Empirical Study of CNN, Transformer, and MLP
Gesticulator: A framework for semantically-aware speech-driven gesture generation
RepMLP: Re-parameterizing Convolutions into Fully-connected Layers for Image Recognition
SegFormer: Simple and Efficient Design for Semantic Segmentation with Transformers
When Vision Transformers Outperform ResNets without Pre-training or Strong Data Augmentations
MLP Singer: Towards Rapid Parallel Korean Singing Voice Synthesis
S2-MLPv2: Improved Spatial-Shift MLP Architecture for Vision
Do You Even Need Attention? A Stack of Feed-Forward Layers Does Surprisingly Well on ImageNet
ResMLP: Feedforward networks for image classification with data-efficient training
MLP-ASR: Sequence-length agnostic all-MLP architectures for speech recognition
Vision Permutator: A Permutable MLP-Like Architecture for Visual Recognition
RaftMLP: How Much Can Be Done Without Attention and with Less Spatial Locality?
Sparse-MLP: A Fully-MLP Architecture with Conditional Computation
Sparse MLP for Image Recognition: Is Self-Attention Really Necessary?
MLP Architectures for Vision-and-Language Modeling: An Empirical Study
MorphMLP: A Self-Attention Free, MLP-Like Backbone for Image and Video
Mixing and Shifting: Exploiting Global and Local Dependencies in Vision MLPs
Synthesizer: Rethinking Self-Attention in Transformer Models
Beyond Self-attention: External Attention using Two Linear Layers for Visual Tasks (EAMLP)
MoGlow: Probabilistic and controllable motion synthesis using normalizing flows
A Style-Based Generator Architecture for Generative Adversarial Networks
Image Generators with Conditionally-Independent Pixel Synthesis
Fourier Neural Operator for Parametric Partial Differential Equations
SIREN: Implicit Neural Representations with Periodic Activation Functions
Neural Geometric Level of Detail: Real-time Rendering with Implicit 3D Shapes
NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis
KiloNeRF: Speeding up Neural Radiance Fields with Thousands of Tiny MLPs
MixerGAN: An MLP-Based Architecture for Unpaired Image-to-Image Translation