Bibliography (4):
Jiang et al. (2023). Mistral 7B.
Devlin et al. (2019). BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding.
Radford et al. (2019). Language Models are Unsupervised Multitask Learners.
https://github.com/kddubey/pretrain-on-test/