AI Navigate

Representation Finetuning for Continual Learning

arXiv cs.AI / 3/13/2026

💬 Opinion · Ideas & Deep Analysis · Models & Research

Key Points

  • CoRe (Continual Representation Learning) shifts fine-tuning from weight space to a representation-space approach to improve continual learning.
  • It constrains updates to a low-rank subspace of hidden representations, achieving parameter efficiency while preserving past-task stability and new-task plasticity.
  • Unlike many PEFT methods, CoRe uses explicit objectives for representation updates to reduce sensitivity to domain shifts and catastrophic forgetting.
  • Experimental results across multiple continual learning benchmarks show CoRe outperforms state-of-the-art methods, introducing representation finetuning as a new, interpretable paradigm.
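The paper's exact update rule is not given in this summary, but the key idea of intervening in a low-rank linear subspace of hidden representations can be sketched. The snippet below is an illustrative, LoReFT-style intervention, not CoRe's actual implementation: all dimensions, the orthonormal projection `R`, and the learned map `(W, b)` are assumptions for the sake of the example.

```python
import numpy as np

rng = np.random.default_rng(0)
d, r = 64, 4  # hidden size and subspace rank (illustrative values)

# Low-rank projection R (r x d) with orthonormal rows, plus a learned
# linear map (W, b) that rewrites coordinates inside that subspace.
Q, _ = np.linalg.qr(rng.standard_normal((d, r)))
R = Q.T                                # (r, d), rows orthonormal
W = rng.standard_normal((r, d)) * 0.01 # hypothetical learned edit
b = np.zeros(r)

def intervene(h):
    """Edit hidden state h only within the r-dim subspace spanned by R."""
    delta = W @ h + b - R @ h  # target subspace coords minus current coords
    return h + R.T @ delta     # write the change back along subspace directions

h = rng.standard_normal(d)
h_new = intervene(h)

# The component of h orthogonal to the subspace is untouched by the edit:
change = h_new - h
residual = change - R.T @ (R @ change)
assert np.allclose(residual, 0.0, atol=1e-8)
```

Parameter efficiency follows directly from the rank constraint: the intervention trains roughly `2*r*d + r` values here instead of a full `d x d` weight update, and confining edits to the subspace is what leaves the rest of the representation (and hence past-task behavior) intact.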

Abstract

The world is inherently dynamic, and continual learning aims to enable models to adapt to ever-evolving data streams. While pre-trained models have shown powerful performance in continual learning, they still require finetuning to adapt effectively to downstream tasks. However, prevailing Parameter-Efficient Fine-Tuning (PEFT) methods operate through empirical, black-box optimization at the weight level. These approaches lack explicit control over representation drift, leading to sensitivity to domain shifts and catastrophic forgetting in continual learning scenarios. In this work, we introduce Continual Representation Learning (CoRe), a novel framework that for the first time shifts the finetuning paradigm from weight space to representation space. Unlike conventional methods, CoRe performs task-specific interventions within a low-rank linear subspace of hidden representations, adopting a learning process with explicit objectives, which ensures stability for past tasks while maintaining plasticity for new ones. By constraining updates to a low-rank subspace, CoRe achieves exceptional parameter efficiency. Extensive experiments across multiple continual learning benchmarks demonstrate that CoRe not only preserves parameter efficiency but also significantly outperforms existing state-of-the-art methods. Our work introduces representation finetuning as a new, more effective and interpretable paradigm for continual learning.
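The abstract emphasizes explicit objectives that keep past-task representations stable while leaving room for plasticity on new tasks. One plausible form of such an objective, offered purely as a hedged sketch (the weighting `lam` and the anchoring scheme are assumptions, not the paper's actual loss), is a task term plus a representation-drift penalty:

```python
import numpy as np

def stability_plasticity_loss(h_new, h_ref, task_loss, lam=0.5):
    """Illustrative continual-learning objective (not CoRe's actual loss):
    the new-task loss plus a penalty anchoring the current representations
    of old-task inputs (h_new) to their pre-adaptation values (h_ref)."""
    drift = float(np.mean((h_new - h_ref) ** 2))  # explicit drift control
    return task_loss + lam * drift

# Toy usage: with zero representation drift, only the task term remains.
h_ref = np.ones(8)
loss = stability_plasticity_loss(h_ref, h_ref, task_loss=0.3)
assert loss == 0.3
```

Making drift an explicit, penalized quantity is what distinguishes this style of objective from weight-space PEFT, where representation movement is only an indirect side effect of the weight update.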