“TinyLlama: An Open-Source Small Language Model”, Peiyuan Zhang, Guangtao Zeng, Tianduo Wang, Wei Lu (2024-01-04):

We present TinyLlama, a compact 1.1B language model pretrained on around 1 trillion tokens for ~3 epochs.

Building on the architecture and tokenizer of LLaMA-2, TinyLlama leverages various advances contributed by the open-source community (e.g., FlashAttention) to achieve better computational efficiency.

Despite its relatively small size, TinyLlama demonstrates remarkable performance on a range of downstream tasks, outperforming existing open-source language models of comparable size.

Our model checkpoints and code are publicly available on GitHub.