Transformer: Attention Is All You Need
S4: Efficiently Modeling Long Sequences with Structured State Spaces
Mamba: Linear-Time Sequence Modeling with Selective State Spaces
Aaren: Attention as an RNN
QRNN: Quasi-Recurrent Neural Networks
xLSTM: Extended Long Short-Term Memory
GRU: Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation
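For orientation, the list spans two poles of sequence modeling: fully parallel attention (the Transformer) and step-by-step gated recurrence (GRU, QRNN, xLSTM), with the state-space models (S4, Mamba) and Attention as an RNN sitting between them. Below is a minimal, illustrative NumPy sketch of the two basic building blocks, scaled dot-product attention and a single GRU update; the function names, shapes, and parameter layout are assumptions for illustration, not code taken from any of the papers.

```python
# Minimal, illustrative sketch (not code from any of the listed papers).
# It contrasts the two basic building blocks the list spans:
#   1) scaled dot-product attention, computed in parallel over the whole sequence
#   2) a single GRU update, applied sequentially one timestep at a time
# Function names, shapes, and the parameter layout here are assumptions.
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    """Q, K, V: (seq_len, d). Returns (seq_len, d)."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)        # all pairwise similarities at once
    return softmax(scores, axis=-1) @ V  # weighted average of the values

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x, h, params):
    """One GRU update: input x (d,), previous hidden state h (d,) -> new h (d,)."""
    Wz, Uz, Wr, Ur, W, U = params
    z = sigmoid(Wz @ x + Uz @ h)             # update gate
    r = sigmoid(Wr @ x + Ur @ h)             # reset gate
    h_tilde = np.tanh(W @ x + U @ (r * h))   # candidate state
    return z * h + (1.0 - z) * h_tilde       # convention from Cho et al. (2014)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    T, d = 8, 16
    X = rng.normal(size=(T, d))
    print(scaled_dot_product_attention(X, X, X).shape)  # (8, 16), fully parallel
    params = [0.1 * rng.normal(size=(d, d)) for _ in range(6)]
    h = np.zeros(d)
    for x in X:                                         # inherently sequential
        h = gru_step(x, h, params)
    print(h.shape)                                      # (16,)
```

The attention call touches every pair of positions in one parallel pass, while the GRU loop carries a fixed-size state forward one step at a time; the S4, Mamba, QRNN, and Aaren papers in the list explore ways to get attention-like parallel training together with recurrent, linear-time inference.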