AI Navigate

Continual Fine-Tuning with Provably Accurate and Parameter-Free Task Retrieval

arXiv cs.LG / 3/17/2026


Key Points

  • The paper addresses continual fine-tuning to adapt a pre-trained backbone to new tasks sequentially while preserving performance on earlier tasks whose data are no longer available.
  • It explains the limitations of existing input- and parameter-adaptation methods: retrieval forgetting and reduced representation adaptability.
  • It proposes a parameter-adaptation method enabling adaptive use of input embeddings at test time with parameter-free retrieval.
  • It derives task-retrieval error bounds for a clustering-based paradigm, linking low retrieval error to the structure of task-specific representation clusters.
  • It introduces two components—an adaptive module composition strategy for task-specific updates and a clustering-based retrieval mechanism—and shows via extensive experiments that they improve retrieval and predictive performance under large shifts in task semantics.

Abstract

Continual fine-tuning aims to adapt a pre-trained backbone to new tasks sequentially while preserving performance on earlier tasks whose data are no longer available. Existing approaches fall into two categories: input adaptation and parameter adaptation. Input-adaptation methods rely on retrieving the most relevant prompts at test time, but require continually learning a retrieval function that is prone to forgetting. Parameter-adaptation methods instead use a fixed input embedding function to enable retrieval-free prediction and avoid forgetting, but sacrifice representation adaptability. To combine their strengths, we propose a new parameter-adaptation method that enables adaptive use of input embeddings at test time with parameter-free retrieval. We derive task-retrieval error bounds for a clustering-based, parameter-free paradigm, providing theoretical guarantees that link low retrieval error to structural properties of task-specific representation clusters and revealing how a well-organized clustering structure enables reliable retrieval. Motivated by this insight, our method is designed with two key components: (i) an adaptive module composition strategy that learns informative task-specific updates to preserve and complement prior knowledge, and (ii) a clustering-based retrieval mechanism that captures distinct representation signatures for each task, enabling adaptive representation use at test time. Extensive experiments show that these components work synergistically to improve retrieval and predictive performance under large shifts in task semantics.
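To make the clustering-based, parameter-free retrieval idea concrete, here is a minimal sketch of one common instantiation: each task is summarized by the centroid of its training embeddings under a frozen backbone, and at test time the query is routed to the task with the nearest centroid. This is an illustrative assumption about the paradigm the abstract describes, not the paper's actual algorithm; the class name `CentroidTaskRetriever` and all details below are hypothetical.

```python
# Hedged sketch of clustering-based, parameter-free task retrieval.
# Assumption (not from the paper): each task is represented by the mean
# of its training embeddings; retrieval is nearest-centroid lookup.
# Because nothing is learned for retrieval, there are no retrieval
# parameters to forget as new tasks arrive.
import numpy as np

class CentroidTaskRetriever:
    def __init__(self):
        self.centroids = {}  # task_id -> mean embedding vector

    def add_task(self, task_id, embeddings):
        # embeddings: (n_samples, dim) array from the frozen backbone
        self.centroids[task_id] = np.mean(embeddings, axis=0)

    def retrieve(self, query_embedding):
        # Return the task whose centroid is nearest in Euclidean distance;
        # the retrieved id would select that task's fine-tuned module.
        return min(self.centroids,
                   key=lambda t: np.linalg.norm(query_embedding - self.centroids[t]))

# Two well-separated synthetic "tasks": tight clusters around 0 and 5.
rng = np.random.default_rng(0)
retriever = CentroidTaskRetriever()
retriever.add_task("task_a", rng.normal(0.0, 0.1, size=(50, 8)))
retriever.add_task("task_b", rng.normal(5.0, 0.1, size=(50, 8)))

print(retriever.retrieve(np.full(8, 4.9)))  # → task_b
```

The sketch also reflects the abstract's theoretical point: retrieval error here depends entirely on how well-separated and compact the task clusters are, which is exactly the structural property the paper's error bounds are said to formalize.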