Pushing the Limits of Distillation-Based Continual Learning via Classifier-Proximal Lightweight Plugins
arXiv stat.ML · 4/6/2026
Key Points
- The paper tackles distillation-based continual learning, focusing on the stability-plasticity dilemma: because the distillation objective (which preserves old knowledge) is coupled with the learning objective (which adapts to new data), strengthening one tends to weaken the other (a generic form of this coupled objective is sketched after this list).
- It introduces Distillation-aware Lightweight Components (DLC), a plugin-based extension that inserts lightweight residual plugins into the classifier-proximal layer to apply semantic-level corrections without heavily disturbing the base feature extractor.
- For inference, DLC aggregates the plugin-enhanced representations into a prediction, adding a lightweight weighting unit that down-ranks non-target plugin representations to reduce interference (see the second sketch below).
- Experiments report roughly an 8% accuracy improvement on large-scale benchmarks for only a 4% increase in backbone parameters, making the method efficient both in added parameters and in how little it disturbs the base model.
- The approach is designed to be compatible with other plug-and-play continual learning enhancements and can provide additional gains when combined with them.
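For context, the coupled objectives behind the stability-plasticity dilemma in the first bullet typically take the learning-without-forgetting form below: a cross-entropy term on new data plus a temperature-scaled distillation term that pins predictions to the frozen old model. This is a generic sketch of distillation-based continual learning, not this paper's specific loss; the function name `coupled_cl_loss` and the hyperparameters `lam` and `T` are illustrative assumptions.

```python
import torch
import torch.nn.functional as F


def coupled_cl_loss(student_logits, teacher_logits, targets, lam=1.0, T=2.0):
    """LwF-style coupled objective: cross-entropy on the new task plus a
    KL distillation term toward the frozen old model. Raising `lam`
    favors stability (old knowledge); lowering it favors plasticity."""
    ce = F.cross_entropy(student_logits, targets)
    kd = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)
    return ce + lam * kd
```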
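To make the DLC mechanism concrete, here is a minimal PyTorch sketch of the two ideas described above: residual bottleneck plugins on the classifier-proximal features, and a softmax gate that down-weights non-target plugin representations before classification. All names (`ResidualPlugin`, `PluginAggregator`, `gate`) and design details (bottleneck width, zero initialization, a single linear gate) are assumptions for illustration, not the paper's exact implementation.

```python
import torch
import torch.nn as nn


class ResidualPlugin(nn.Module):
    """Bottleneck plugin applied to classifier-proximal features.

    Output = feature + learned correction, so the base representation is
    preserved and the plugin contributes only a small semantic adjustment."""

    def __init__(self, feat_dim: int, bottleneck: int = 64):
        super().__init__()
        self.down = nn.Linear(feat_dim, bottleneck)
        self.act = nn.ReLU()
        self.up = nn.Linear(bottleneck, feat_dim)
        # Zero-init the up-projection so a new plugin starts as the identity
        # and cannot disturb the frozen feature extractor at creation time.
        nn.init.zeros_(self.up.weight)
        nn.init.zeros_(self.up.bias)

    def forward(self, feat: torch.Tensor) -> torch.Tensor:
        return feat + self.up(self.act(self.down(feat)))


class PluginAggregator(nn.Module):
    """One plugin per task; a small gate down-ranks non-target plugins."""

    def __init__(self, feat_dim: int, num_classes: int, num_tasks: int):
        super().__init__()
        self.plugins = nn.ModuleList(
            [ResidualPlugin(feat_dim) for _ in range(num_tasks)]
        )
        # Lightweight weighting unit (assumed form): a single linear probe
        # scoring each plugin-enhanced representation.
        self.gate = nn.Linear(feat_dim, 1)
        self.classifier = nn.Linear(feat_dim, num_classes)

    def forward(self, feat: torch.Tensor) -> torch.Tensor:
        # feat: (B, D) classifier-proximal features from a frozen backbone.
        enhanced = torch.stack([p(feat) for p in self.plugins], dim=1)  # (B, T, D)
        scores = self.gate(enhanced).squeeze(-1)                        # (B, T)
        weights = torch.softmax(scores, dim=1)   # suppress non-target plugins
        fused = (weights.unsqueeze(-1) * enhanced).sum(dim=1)           # (B, D)
        return self.classifier(fused)


if __name__ == "__main__":
    model = PluginAggregator(feat_dim=512, num_classes=100, num_tasks=5)
    print(model(torch.randn(8, 512)).shape)  # torch.Size([8, 100])
```

The zero-initialized up-projection makes a freshly added plugin a no-op at creation, which matches the stated goal of applying semantic-level corrections without disturbing the base feature extractor.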