Bibliography (7):

  1. Attention Is All You Need (Vaswani et al., 2017)

  2. S4: Efficiently Modeling Long Sequences with Structured State Spaces (Gu et al., 2022)

  3. Mamba: Linear-Time Sequence Modeling with Selective State Spaces (Gu & Dao, 2023)

  4. Attention as an RNN (Feng et al., 2024)

  5. QRNN: Quasi-Recurrent Neural Networks (Bradbury et al., 2016)

  6. xLSTM: Extended Long Short-Term Memory (Beck et al., 2024)

  7. GRU: Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation (Cho et al., 2014)