Bibliography (6):

  1. Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism

  2. 2021-junseong-hyperclova.html

  3. GPT-3: Language Models are Few-Shot Learners