GRASS: Gradient-based Adaptive Layer-wise Importance Sampling for Memory-efficient Large Language Model Fine-tuning
arXiv cs.CL / 4/10/2026
Key Points
- The paper proposes GRASS, a memory-efficient full-parameter fine-tuning framework that improves on layer-wise importance sampling by making it adaptive to both tasks and training stages.
- GRASS estimates layer importance using mean gradient norms, enabling sampling decisions that reflect how different layers matter at different points in training.
- It further adapts layer sampling probabilities during training, aiming to preserve or improve downstream performance relative to prior static layer importance approaches.
- The method includes a layer-wise optimizer state offloading technique that overlaps computation and communication to reduce GPU memory usage without significantly hurting training throughput.
- Experiments across multiple models and benchmarks show GRASS consistently outperforming existing state-of-the-art methods, with reported average accuracy gains of up to 4.38 points and memory reductions of up to 19.97%.
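The adaptive sampling idea in the first three points can be sketched as follows: track recent per-layer gradient norms, average them into importance scores, normalize into sampling probabilities, and draw a subset of layers to update each step. This is a minimal illustration of the general technique, not the paper's implementation; all function names and the windowed-history representation are assumptions.

```python
import random

def layer_sampling_probs(grad_norm_history):
    """Estimate per-layer importance as the mean gradient norm over a
    recent window, then normalize into sampling probabilities.
    `grad_norm_history` maps layer index -> list of recent gradient norms.
    (Hypothetical helper; the paper's exact estimator may differ.)"""
    importance = {i: sum(norms) / len(norms)
                  for i, norms in grad_norm_history.items()}
    total = sum(importance.values())
    return {i: v / total for i, v in importance.items()}

def sample_layers(probs, k, rng=random):
    """Sample k distinct layers to update this step, weighted by the
    adaptive probabilities (sampling without replacement)."""
    layers = list(probs)
    weights = [probs[i] for i in layers]
    chosen = []
    for _ in range(k):
        pick = rng.choices(layers, weights=weights)[0]
        idx = layers.index(pick)
        layers.pop(idx)
        weights.pop(idx)
        chosen.append(pick)
    return chosen

# Example: layer 1 has the largest recent gradient norms, so it gets the
# highest sampling probability and is most likely to be selected.
history = {0: [0.1, 0.2], 1: [1.0, 1.2], 2: [0.5, 0.4]}
probs = layer_sampling_probs(history)
selected = sample_layers(probs, k=2)
```

Because the probabilities are recomputed from a rolling window of gradient norms, the same machinery naturally shifts sampling mass between layers as training progresses, which is the "adaptive to training stages" behavior the paper targets.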
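The offloading point describes overlapping optimizer-state transfers with compute so the copy cost is hidden. A rough sketch of that overlap pattern, using CPU threads as a stand-in for the CUDA streams a real implementation would use (the function names and callback shapes here are illustrative assumptions, not the paper's API):

```python
import threading

def offload_states(layer_states, compute, offload):
    """Overlap optimizer-state offloading with the next layer's compute:
    while layer i runs `compute`, layer i-1's state copy may still be in
    flight on a background thread. We only join the previous transfer
    after the current layer's compute, so copy and compute overlap."""
    pending = None
    for i, state in enumerate(layer_states):
        compute(i)            # forward/backward + update work for layer i
        if pending is not None:
            pending.join()    # previous offload must finish before reuse
        pending = threading.Thread(target=offload, args=(i, state))
        pending.start()       # copy layer i's state off-GPU asynchronously
    if pending is not None:
        pending.join()        # drain the last in-flight transfer
```

In a real system the `offload` callback would be an async device-to-host copy on a side stream; the join points correspond to stream synchronization, which is what keeps the memory savings from costing throughput.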