https://x.com/NolanoOrg/status/1634027966651834370
GPT-3: Language Models are Few-Shot Learners
https://github.com/NolanoOrg/llama-int4-quant/
https://github.com/qwopqwop200/GPTQ-for-LLaMa
GPTQ: Accurate Post-Training Quantization for Generative Pre-trained Transformers
FLAN: Scaling Instruction-Finetuned Language Models
EleutherAI/gpt-neo: An Implementation of Model Parallel GPT-2 and GPT-3-Style Models Using the Mesh-Tensorflow Library.