
GAST: Gradient-aligned Sparse Tuning of Large Language Models with Data-layer Selection

arXiv cs.LG / 3/11/2026

Models & Research

Key Points

  • Parameter-Efficient Fine-Tuning (PEFT) is essential for adapting large language models at manageable cost, with recent sparse tuning methods reducing computational overhead through selective updates.
  • Existing methods focus on either layer-selective or data-selective tuning but typically ignore the varying contributions of different data points to individual model layers.
  • The proposed Gradient-aligned Sparse Tuning (GAST) method jointly optimizes data and layer selection to reduce redundancy by adaptively selecting impactful data points for each layer.
  • GAST integrates layer- and data-sparse strategies into a unified framework, outperforming baseline approaches and offering a more nuanced and effective PEFT solution.
  • Experimental results validate GAST's superior performance, highlighting its promise for advancing future parameter-efficient tuning research in large language models.


arXiv:2603.09865 (cs)
[Submitted on 10 Mar 2026]

Title: GAST: Gradient-aligned Sparse Tuning of Large Language Models with Data-layer Selection

Authors: Kai Yao and 7 other authors
Abstract: Parameter-Efficient Fine-Tuning (PEFT) has become a key strategy for adapting large language models, with recent advances in sparse tuning reducing overhead by selectively updating parameters or training on subsets of the data. Existing approaches generally follow one of two paradigms: layer-selective methods, which fine-tune only critical layers to reduce computational load, and data-selective methods, which pick effective training subsets to improve training efficiency. However, current methods typically overlook the fact that different data points contribute to different model layers to varying degrees, and they often discard potentially valuable information from data perceived as low quality. To address these limitations, we propose Gradient-aligned Sparse Tuning (GAST), a method that performs selective fine-tuning along both the data and layer dimensions as part of a unified optimization strategy. GAST targets informational redundancy through a layer-wise data-sparse strategy that adaptively selects the most impactful data points for each layer, offering a more comprehensive solution than approaches restricted to a single dimension. Experiments demonstrate that GAST consistently outperforms baseline methods, establishing a promising direction for future research on PEFT strategies.
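
The abstract does not specify how gradient alignment is computed, but the core idea, scoring each training example's usefulness to each layer by how well its gradient aligns with an aggregate gradient direction, can be sketched. The following is a minimal, hypothetical PyTorch illustration, not the authors' implementation: the cosine-similarity scoring rule, the helper names (per_example_layer_grads, select_per_layer), and the per-layer budget k are all assumptions made for the sketch.

    # Hypothetical sketch of gradient-aligned, per-layer data selection.
    # Not the GAST authors' code; the scoring rule and names are assumptions.
    import torch
    import torch.nn.functional as F

    def per_example_layer_grads(model, loss_fn, xs, ys):
        """Per-example gradient of the loss w.r.t. each layer's parameters.

        Returns a list (one entry per example) of dicts mapping
        parameter name -> flattened gradient vector.
        """
        grads = []
        for x, y in zip(xs, ys):
            model.zero_grad()
            loss = loss_fn(model(x.unsqueeze(0)), y.unsqueeze(0))
            loss.backward()
            grads.append({
                name: p.grad.detach().flatten().clone()
                for name, p in model.named_parameters()
                if p.grad is not None
            })
        return grads

    def select_per_layer(grads, k):
        """For each layer, keep the k examples whose gradients align best
        (by cosine similarity) with the batch-mean gradient of that layer."""
        selected = {}
        for name in grads[0]:
            g = torch.stack([ex[name] for ex in grads])    # (N, D)
            mean_g = g.mean(dim=0, keepdim=True)           # (1, D)
            scores = F.cosine_similarity(g, mean_g, dim=1) # (N,)
            selected[name] = torch.topk(scores, k).indices # example ids
        return selected

In a full training loop one would then, for each layer, accumulate gradients only over that layer's selected examples before the optimizer step, so different layers learn from different data subsets. The naive per-example loop above is for clarity only; in practice per-example gradients are computed far more efficiently with torch.func (vmap over grad).
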
Subjects: Machine Learning (cs.LG)
Cite as: arXiv:2603.09865 [cs.LG]
  (or arXiv:2603.09865v1 [cs.LG] for this version)
  https://doi.org/10.48550/arXiv.2603.09865

Submission history

From: Penglei Gao
[v1] Tue, 10 Mar 2026 16:28:48 UTC (382 KB)