Flexible task abstractions emerge in linear networks with fast and bounded units
Investigating learning-independent abstract reasoning in artificial neural networks
How Do Large Language Models Acquire Factual Knowledge During Pretraining?
Is Model Collapse Inevitable? Breaking the Curse of Recursion by Accumulating Real and Synthetic Data
Simple and Scalable Strategies to Continually Pre-train Large Language Models
Online Adaptation of Language Models with a Memory of Amortized Contexts (MAC)
When Scaling Meets LLM Finetuning: The Effect of Data, Model and Finetuning Method
Investigating Continual Pretraining in Large Language Models: Insights and Implications
RAG vs Fine-tuning: Pipelines, Tradeoffs, and a Case Study on Agriculture (https://arxiv.org/abs/2401.08406)
In-Context Pretraining (ICP): Language Modeling Beyond Document Boundaries
Loss of Plasticity in Deep Continual Learning (Continual Backpropagation)
Continual Diffusion: Continual Customization of Text-to-Image Diffusion with C-LoRA
The Forward-Forward Algorithm: Some Preliminary Investigations
Exclusive Supermask Subnetwork Training for Continual Learning
Learn the Time to Learn: Replay Scheduling in Continual Learning
On the Effectiveness of Compact Biomedical Transformers (BioBERT)
Don’t Stop Learning: Towards Continual Learning for the CLIP Model
Fleet-DAgger: Interactive Robot Fleet Learning with Scalable Human Supervision (https://arxiv.org/abs/2110.11526)
Task-Agnostic Continual Reinforcement Learning: In Praise of a Simple Baseline (3RL)
Memorization Without Overfitting: Analyzing the Training Dynamics of Large Language Models
Continual Pre-Training Mitigates Forgetting in Language and Vision
Continual Learning with Foundation Models: An Empirical Study of Latent Replay
DualPrompt: Complementary Prompting for Rehearsal-free Continual Learning
Effect of scale on catastrophic forgetting in neural networks
The Dual Form of Neural Networks Revisited: Connecting Test Time Predictions to Training Patterns via Spotlights of Attention
An Empirical Investigation of the Role of Pre-training in Lifelong Learning
The Geometry of Representational Drift in Natural and Artificial Neural Networks
Lifelong Pretraining: Continually Adapting Language Models to Emerging Corpora
Continuous Coordination As a Realistic Scenario for Lifelong Learning
Inductive Biases for Deep Learning of Higher-Level Cognition
Learning from the Past: Meta-Continual Learning with Knowledge Embedding for Jointly Sketch, Cartoon, and Caricature Face Recognition
Meta-Learning through Hebbian Plasticity in Random Networks
Understanding the Role of Training Regimes in Continual Learning
Don’t Stop Pretraining: Adapt Language Models to Domains and Tasks
Never Stop Learning: The Effectiveness of Fine-Tuning in Robotic Reinforcement Learning
Unicorn: Continual Learning with a Universal, Off-policy Agent
PathNet: Evolution Channels Gradient Descent in Super Neural Networks
Repeat Before Forgetting: Spaced Repetition for Efficient and Effective Training of Neural Networks
[Figure 1, Ibrahim et al. 2024: continual pre-training with a cyclical learning rate matches from-scratch training.]