- Attention Is All You Need
- S4: Efficiently Modeling Long Sequences with Structured State Spaces
- Mamba: Linear-Time Sequence Modeling with Selective State Spaces
- Attention as an RNN
- QRNNs: Quasi-Recurrent Neural Networks
- xLSTM: Extended Long Short-Term Memory
- GRU: Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation