“LLaMa-1: Open and Efficient Foundation Language Models”, 2023-02-27:
We introduce LLaMa, a collection of foundation language models ranging from 7B to 65B parameters.
We train our models on trillions of tokens, and show that it is possible to train state-of-the-art models using publicly available datasets exclusively, without resorting to proprietary and inaccessible datasets.
In particular, LLaMa-13B outperforms GPT-3 (175B) on most benchmarks, and LLaMa-65B is competitive with the best models, Chinchilla-70B and PaLM-540B.
We release all our models to the research community [the weights were subsequently leaked].