Low-Rank Adaptation Reduces Catastrophic Forgetting in Sequential Transformer Encoder Fine-Tuning: Controlled Empirical Evidence and Frozen-Backbone Representation Probes

arXiv cs.LG / March 31, 2026


Key Points

  • The paper presents a controlled empirical study of Low-Rank Adaptation (LoRA) for sequential fine-tuning of pretrained transformer encoders, focusing on whether it reduces catastrophic forgetting versus full fine-tuning.
  • In five reruns on a BERT-base sequence (RTE→MRPC→CoLA→SST-2), full fine-tuning shows about 19.9%±4.8% average forgetting, while standard LoRA (r=8 on query/value modules) reduces forgetting to about 0.6%±1.4% with statistically significant improvement.
  • Task-level analyses and secondary experiments on RoBERTa-base confirm that LoRA’s reduced forgetting is not just an aggregate artifact; LoRA also outperforms the strongest Elastic Weight Consolidation (EWC) baseline (≈15.5%±1.4% forgetting).
  • A six-task extension demonstrates that low average forgetting can mask substantial task-level heterogeneity, highlighting the need for more granular evaluation in continual learning settings.
  • Freezing and representation-probe ablations indicate a mechanistic account: forgetting drops notably once frozen parameters exceed ~95%, and probes suggest backbone freezing preserves a more stable shared feature scaffold, with full fine-tuning diverging most clearly at the final transformer layer.
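The headline forgetting numbers can be made concrete with a small sketch. One common definition of average forgetting measures, for each task, the gap between the best accuracy achieved at any earlier checkpoint and the accuracy after the final task, then averages over all tasks but the last. The accuracy values below are hypothetical placeholders for an RTE→MRPC→CoLA→SST-2 run, not the paper's measurements.

```python
def average_forgetting(acc):
    """acc[t][i] = accuracy on task i after training on task t (0-indexed).

    One common definition: for each task i seen before the last checkpoint,
    forgetting is the best accuracy at any earlier checkpoint minus the
    final accuracy; the result is the mean over those tasks.
    """
    T = len(acc)
    drops = []
    for i in range(T - 1):
        best_earlier = max(acc[t][i] for t in range(i, T - 1))
        drops.append(best_earlier - acc[T - 1][i])
    return sum(drops) / len(drops)

# Hypothetical accuracies for a 4-task sequence (rows = checkpoints,
# columns = tasks); zeros mark tasks not yet trained.
acc = [
    [0.70, 0.00, 0.00, 0.00],
    [0.55, 0.85, 0.00, 0.00],
    [0.52, 0.80, 0.60, 0.00],
    [0.50, 0.78, 0.58, 0.92],
]

print(round(average_forgetting(acc), 3))  # → 0.097
```

Reporting this per task, rather than only as a mean, is exactly what the six-task extension motivates: a near-zero average can coexist with large drops on individual tasks.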

Abstract

Sequential fine-tuning of pretrained language encoders often overwrites previously acquired capabilities, but the forgetting behavior of parameter-efficient updates remains under-characterized. We present a controlled empirical study of Low-Rank Adaptation (LoRA) in sequential transformer encoder fine-tuning with companion representation probes that test a frozen-backbone explanation of its robustness. In five full-validation BERT-base reruns on an RTE→MRPC→CoLA→SST-2 sequence, full fine-tuning yields 19.9%±4.8% average forgetting, whereas standard LoRA (r=8, query/value modules) yields 0.6%±1.4% (paired t-test, p=0.002, Cohen's d_s=3.12). Task-level analyses confirm this reduction is not merely an aggregate effect. Secondary experiments on RoBERTa-base show the same pattern, and the strongest EWC baseline remains at 15.5%±1.4% forgetting. A six-task extension reveals that low average forgetting can hide strong task-level heterogeneity. Fine-grained freezing ablations show a marked forgetting drop once frozen parameters exceed roughly 95%, with classifier-only and shallow-adapter baselines approaching LoRA. Companion task-similarity probes in GPT-2 and RoBERTa show the same directional story: frozen-backbone regimes preserve higher inter-task similarity than full fine-tuning, gradual unfreezing weakens stability, and full fine-tuning exhibits its clearest divergence at the final transformer layer. These results support a restrained mechanistic interpretation: LoRA helps largely because backbone freezing preserves a more stable shared feature scaffold. We position standard LoRA as both a strong empirical baseline for sequential encoder adaptation and a useful probe of how selective plasticity shapes interference in transformer continual learning.
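The mechanism the abstract appeals to is that LoRA leaves the pretrained weight frozen and trains only a low-rank correction, so the effective weight is W + (α/r)·BA. The sketch below illustrates this with plain Python lists on a tiny matrix; the dimensions, α, and the values of A and B are illustrative assumptions, not taken from the paper.

```python
# Minimal sketch of the LoRA update on one weight matrix (e.g. a query or
# value projection). W is frozen; only the low-rank factors A (r x d) and
# B (d x r) would receive gradients, giving a rank-<=r additive update.

def matmul(X, Y):
    """Naive matrix product for small list-of-lists matrices."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def lora_effective_weight(W, A, B, alpha, r):
    delta = matmul(B, A)            # rank <= r correction
    scale = alpha / r               # standard LoRA scaling
    return [[W[i][j] + scale * delta[i][j] for j in range(len(W[0]))]
            for i in range(len(W))]

d, r, alpha = 4, 1, 2               # illustrative sizes: 4x4 weight, rank-1 adapter
W = [[1.0 if i == j else 0.0 for j in range(d)] for i in range(d)]  # frozen
A = [[0.5, 0.0, 0.0, 0.5]]          # r x d, trainable (hypothetical values)
B = [[1.0], [0.0], [0.0], [1.0]]    # d x r, trainable (hypothetical values)

W_eff = lora_effective_weight(W, A, B, alpha, r)
print(W_eff[0])  # → [2.0, 0.0, 0.0, 1.0]
```

Because only A and B change across tasks, the backbone W that all tasks share is untouched, which is the "stable shared feature scaffold" reading the freezing ablations and representation probes support.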