Bibliography (8):

  1. OPT: Open Pre-trained Transformer Language Models

  2. Bigscience/bloom

  3. GLM-130B: An Open Bilingual Pre-trained Model

  4. https://github.com/NVIDIA/FasterTransformer

  5. https://github.com/mit-han-lab/smoothquant