“LLaMa-1: Open and Efficient Foundation Language Models”, Hugo Touvron, Thibaut Lavril, Gautier Izacard, Xavier Martinet, Marie-Anne Lachaux, Timothée Lacroix, Baptiste Rozière, Naman Goyal, Eric Hambro, Faisal Azhar, Aurelien Rodriguez, Armand Joulin, Edouard Grave, Guillaume Lample 2023-02-27:

We introduce LLaMa, a collection of foundation language models ranging from 7B to 65B parameters.

We train our models on trillions of tokens, and show that it is possible to train state-of-the-art models using publicly available datasets exclusively, without resorting to proprietary and inaccessible datasets.

In particular, LLaMa-13B outperforms GPT-3 (175B) on most benchmarks, and LLaMa-65B is competitive with the best models, Chinchilla-70B and PaLM-540B.

We release all our models to the research community [which were then leaked].