Bibliography (4):
https://gonzoml.substack.com/p/you-only-cache-once-decoder-decoder
Attention Is All You Need
https://github.com/microsoft/unilm/tree/master/YOCO
Wikipedia Bibliography:
Attention (machine learning)