Memorization Without Overfitting: Analyzing the Training Dynamics of Large Language Models
Continual Learning with Foundation Models: An Empirical Study of Latent Replay
Don't Stop Learning: Towards Continual Learning for the CLIP Model
Lifelong Pretraining: Continually Adapting Language Models to Emerging Corpora
DualPrompt: Complementary Prompting for Rehearsal-free Continual Learning
T0: Multitask Prompted Training Enables Zero-Shot Task Generalization
Effect of Scale on Catastrophic Forgetting in Neural Networks
T5: Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
Continual Pre-Training Mitigates Forgetting in Language and Vision
An Empirical Investigation of the Role of Pre-training in Lifelong Learning