Bibliography (3):

  1. https://github.com/databricks/megablocks

  2. ‘end-to-end’ directory

  3. MegatronLM: Training Billion+ Parameter Language Models Using GPU Model Parallelism