DiM: Scaling Diffusion Mamba with Bidirectional SSMs for Efficient Image and Video Generation
Learning to (Learn at Test Time): RNNs with Expressive Hidden States
Attention Is All You Need
Transformers are SSMs: Generalized Models and Efficient Algorithms Through Structured State Space Duality
Gated Delta Networks: Improving Mamba-2 with Delta Rule
https://test-time-training.github.io/video-dit