Bibliography (8):

SMYRF: Efficient Attention using Asymmetric Clustering
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
A domain-specific supercomputer for training deep neural networks
Large Scale GAN Training for High Fidelity Natural Image Synthesis
A Style-Based Generator Architecture for Generative Adversarial Networks

Wikipedia Bibliography:

1. Locality-sensitive hashing
2. Generative adversarial network