Bibliography:

  1. Language Models are Unsupervised Multitask Learners

  2. ConnorJL/GPT2: An Implementation of Training for GPT-2, Supports TPUs

  3. https://github.com/ConnorJL/GPT2/tree/master/samples

  4. A domain-specific supercomputer for training deep neural networks

  5. Wikipedia: OpenAI