Bibliography (4):

Attention Is All You Need
A domain-specific supercomputer for training deep neural networks
TensorStore for High-Performance, Scalable Array Storage (https://research.google/blog/tensorstore-for-high-performance-scalable-array-storage/)
Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models