Catastrophic forgetting remains a persistent challenge when performing sequential or multi-task fine-tuning on LLMs. Models often lose significant capability on previous tasks or general knowledge as they adapt to new domains (medical, legal, code, etc.).
This seems rooted in how gradient-based optimization works: new updates overwrite earlier representations, with no explicit separation between fast learning and long-term consolidation.
Common mitigations (LoRA, replay buffers, EWC, etc.) provide some relief but come with their own trade-offs in scalability, cost, and efficiency.
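To make the regularization family concrete, here's a minimal sketch of the EWC-style penalty: a quadratic term that anchors parameters to their old-task values, weighted by an estimate of each parameter's importance (the diagonal Fisher information). The function name, toy values, and use of a plain NumPy array in place of real model weights are illustrative assumptions, not anyone's actual implementation.

```python
import numpy as np

def ewc_penalty(params, old_params, fisher, lam=1.0):
    # Quadratic penalty anchoring parameters to values learned on a
    # previous task; fisher holds per-parameter importance estimates.
    return 0.5 * lam * np.sum(fisher * (params - old_params) ** 2)

# Toy check: moving "important" weights costs more than unimportant ones.
old = np.zeros(3)
fisher = np.array([10.0, 1.0, 0.0])  # importance estimated on the old task
new = np.array([1.0, 1.0, 1.0])
print(ewc_penalty(new, old, fisher))  # 0.5 * (10 + 1 + 0) = 5.5
```

The scalability trade-off shows up directly here: the Fisher estimates (one per parameter, sometimes one set per task) have to be stored and maintained alongside the model itself.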
We've been exploring a dual-memory architecture inspired by complementary learning systems in neuroscience (fast episodic memory + slower semantic consolidation). Early experiments on standard continual learning benchmarks show strong retention (~98% on sequential splits) while maintaining competitive accuracy, compared to standard gradient baselines whose retention drops near zero.
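For readers unfamiliar with the complementary-learning-systems idea, here's a toy sketch of the two-store pattern: a fast episodic buffer that writes traces in one shot, plus slow "semantic" weights that absorb those traces gradually via replay, so new memories blend in rather than overwrite. The class name, the linear associator, and the Hebbian-style update are all my own simplifications for illustration, not the architecture described in the post.

```python
import numpy as np

class DualMemory:
    """Toy complementary-learning-systems sketch: fast exact episodic
    store + slow weights updated by small replay-driven steps."""

    def __init__(self, dim, consolidation_rate=0.01):
        self.episodic = []                 # fast store: exact (key, value) traces
        self.W = np.zeros((dim, dim))      # slow store: linear associator
        self.rate = consolidation_rate

    def write(self, key, value):
        self.episodic.append((key, value))  # one-shot episodic write

    def consolidate(self, steps=1000):
        # Replay random episodes into the slow weights with small delta-rule
        # updates; gradual interleaving limits overwriting of old associations.
        for _ in range(steps):
            key, value = self.episodic[np.random.randint(len(self.episodic))]
            pred = self.W @ key
            self.W += self.rate * np.outer(value - pred, key)

    def recall(self, key):
        # Prefer an exact episodic hit; fall back to the slow associator.
        for k, v in self.episodic:
            if np.allclose(k, key):
                return v
        return self.W @ key
```

The separation is what buys retention in this toy: new writes never touch `W` directly, so consolidation speed (not gradient interference) controls how old knowledge is affected.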
Here's a quick snapshot of selected tests (learned encoder):
| Test | Metric | Our approach | Gradient baseline | Gap |
|---|---|---|---|---|
| #1 Continual (10 seeds) | Retention | 0.980 ± 0.005 | 0.006 ± 0.006 | +0.974 |
| #2 Few-shot k=1 | Accuracy | 0.593 | 0.264 | +0.329 |
| #3 Novelty detection | AUROC | 0.898 | 0.793 | +0.105 |
| #5 Long-horizon recall | Recall at N=5000 | 1.000 | 0.125 | 8× |
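For context on the retention column: I don't know the post's exact protocol, but a common convention in continual-learning papers is to report the fraction of a task's original accuracy that survives after training on later tasks. A minimal sketch under that assumption:

```python
def retention(acc_after_later_tasks, acc_right_after_training):
    # Fraction of a task's post-training accuracy still held after the
    # model has been trained on subsequent tasks (1.0 = no forgetting).
    return acc_after_later_tasks / acc_right_after_training

# Example at the table's scale: a model at 0.90 accuracy on task 1 that
# drops to 0.882 after learning task 2 retains 0.98 of its performance.
print(round(retention(0.882, 0.90), 3))  # 0.98
```

On this reading, the baseline's ~0.006 retention means it keeps essentially none of its earlier-task accuracy after the sequence.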
Still early-stage research with plenty of limitations (e.g., weaker on pure feature transfer tasks).
Questions for the community: What approaches have shown the most promise for continual learning in LLMs beyond replay/regularization? Is architectural separation of memory (vs. training tricks) a viable direction? And how much of a bottleneck is catastrophic forgetting for practical multi-task LLM work today?
Looking forward to thoughts on this.
