Efficient Task Adaptation in Large Language Models via Selective Parameter Optimization

arXiv cs.CL / 4/21/2026


Key Points

  • The paper addresses a key issue with fine-tuning LLMs for domain-specific tasks: parameter updates can overwrite or “forget” general knowledge, reducing generalization and transferability.
  • It introduces a method that scores the importance of individual parameter elements, separating them into “core parameters” (critical for general language ability) and “non-core parameters” (more task-specific).
  • During fine-tuning, the approach keeps core parameters fixed and updates only non-core parameters, aiming to preserve pre-trained capabilities.
  • Experiments on scientific, medical, and physical tasks using GPT-J and LLaMA-3 indicate the method reduces catastrophic forgetting while improving task adaptability.
  • Overall, the work suggests a selective parameter optimization strategy that exploits heterogeneity in parameter sensitivity to general vs. domain tasks.
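The key points above can be illustrated with a minimal NumPy sketch. This is not the paper's exact procedure: the importance score (squared gradient on a general-language batch), the top-fraction threshold, and the plain SGD step are all illustrative assumptions standing in for whatever criterion the authors actually use.

```python
import numpy as np

def core_mask(general_grads, keep_frac=0.2):
    """Flag the top `keep_frac` fraction of parameters as 'core'.

    Importance here is the squared gradient measured on a general-language
    batch -- an assumed proxy, not the paper's stated criterion.
    """
    importance = general_grads ** 2
    k = int(np.ceil(keep_frac * importance.size))
    threshold = np.partition(importance.ravel(), -k)[-k]
    return importance >= threshold  # True = core (frozen)

def selective_update(params, task_grads, mask, lr=1e-2):
    """One SGD step that updates only non-core parameters."""
    return params - lr * task_grads * (~mask)

# Toy run: 10 scalar "parameters", 30% frozen as core.
rng = np.random.default_rng(0)
params = rng.normal(size=10)
general_grads = rng.normal(size=10)   # grads from a general-ability task
task_grads = rng.normal(size=10)      # grads from the domain task
mask = core_mask(general_grads, keep_frac=0.3)
new_params = selective_update(params, task_grads, mask)
```

After the step, `new_params[mask]` equals `params[mask]` exactly: core parameters never move, so whatever general capability they encode is preserved, while the remaining parameters absorb the domain task.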

Abstract

Large Language Models (LLMs) have demonstrated excellent performance in general language understanding, generation, and related tasks. However, when they are fine-tuned for specific domain tasks, the general knowledge accumulated during pre-training is often partially overwritten or forgotten by parameter updates, which severely limits the generalization ability and transferability of LLMs. Traditional fine-tuning strategies mostly train over the entire parameter space, ignoring the heterogeneity of model parameters: some parameters are critical for general tasks, while others are more sensitive to specific tasks. To alleviate these problems, this paper proposes a parameter-element importance evaluation method that distinguishes how important each parameter is to general language ability versus specific domain tasks, dividing parameters into "core parameters" and "non-core parameters". During fine-tuning, the core parameters are kept fixed and only the non-core parameters are updated. Extensive experiments on scientific, medical, and physical tasks using GPT-J and LLaMA-3 show that the method mitigates catastrophic forgetting while enhancing the adaptability of the model.