Bibliography (8):

  1. Compressive Transformers for Long-Range Sequence Modeling

  2. A domain-specific supercomputer for training deep neural networks

  3. ScaNN: Accelerating Large-Scale Inference with Anisotropic Vector Quantization

  4. Generating Sequences With Recurrent Neural Networks

  5. Improving Neural Language Models with a Continuous Cache