Compressive Transformers for Long-Range Sequence Modeling
A domain-specific supercomputer for training deep neural networks
SCaNN: Accelerating Large-Scale Inference with Anisotropic Vector Quantization
Generating Sequences With Recurrent Neural Networks
Improving Neural Language Models with a Continuous Cache