‘self-attention’ tag

See Also

Gwern

“Absolute Unit NNs: Regression-Based MLPs for Everything”, Gwern 2023

“Research Ideas”, Gwern 2017

“GPT-3 Creative Fiction”, Gwern 2020

“Efficient Attention: Breaking The Quadratic Transformer Bottleneck”, Gwern 2020

Miscellaneous