Aletheia: Gradient-Guided Layer Selection for Efficient LoRA Fine-Tuning Across Architectures
arXiv cs.LG · April 20, 2026
📰 News · Tools & Practical Usage · Models & Research
Key Points
- The paper introduces Aletheia, a gradient-guided method that selects the most task-relevant transformer layers for LoRA rather than applying adapters uniformly across all layers.
- Aletheia uses a lightweight gradient probe to identify relevant layers and performs LoRA with asymmetric rank allocation only on those selected layers.
- Across 81 experiment rows covering 14 successfully fine-tuned model variants from 8 architecture families (0.5B–72B parameters, including both dense and Mixture-of-Experts models), Aletheia delivers a mean training speedup of 23.1%, with per-model gains ranging from 15% to 28%.
- The approach shows only bounded additional forgetting and broadly matched downstream results on MMLU, GSM8K, and HumanEval; a second evaluation campaign reports preserved behavior overall, with one failed attempt (Pythia/GPT-NeoX).
- Overall, the results support a practical “model economics” claim that intelligent layer selection can make LoRA fine-tuning significantly more efficient while causing limited degradation on the evaluated benchmarks.
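The paper's code is not reproduced here, but the core idea described in the key points — probe gradients once, rank layers by gradient magnitude, then attach LoRA only to the top-scoring layers with proportionally higher ranks — can be sketched in plain PyTorch. This is a minimal illustrative sketch on a toy model, not Aletheia's implementation; the scoring function (per-block gradient L2 norm), the `k=3` selection, and the rank bounds are all assumptions for illustration.

```python
# Hedged sketch of gradient-guided layer selection for LoRA (toy model,
# NOT the paper's code). Idea: run a probe batch, score each block by the
# L2 norm of its parameter gradients, then select the top-k blocks and
# assign higher LoRA rank to higher-scoring blocks (asymmetric allocation).
import torch
import torch.nn as nn

class ToyBlock(nn.Module):
    """Stand-in for a transformer layer."""
    def __init__(self, d):
        super().__init__()
        self.proj = nn.Linear(d, d)
    def forward(self, x):
        return torch.relu(self.proj(x))

class ToyModel(nn.Module):
    def __init__(self, d=16, n_layers=6):
        super().__init__()
        self.blocks = nn.ModuleList(ToyBlock(d) for _ in range(n_layers))
        self.head = nn.Linear(d, 1)
    def forward(self, x):
        for b in self.blocks:
            x = b(x)
        return self.head(x)

def probe_layer_scores(model, batch, targets, loss_fn):
    """One backward pass; score each block by its gradient L2 norm."""
    model.zero_grad()
    loss_fn(model(batch), targets).backward()
    scores = []
    for block in model.blocks:
        sq = sum(p.grad.pow(2).sum() for p in block.parameters()
                 if p.grad is not None)
        scores.append(sq.sqrt().item())
    model.zero_grad()
    return scores

def allocate_ranks(scores, k=3, max_rank=8, min_rank=2):
    """Keep the top-k layers; rank is proportional to relative score."""
    top = sorted(range(len(scores)), key=lambda i: -scores[i])[:k]
    hi = max(scores[i] for i in top)
    return {i: max(min_rank, round(max_rank * scores[i] / hi)) for i in top}

torch.manual_seed(0)
model = ToyModel()
x, y = torch.randn(32, 16), torch.randn(32, 1)
scores = probe_layer_scores(model, x, y, nn.MSELoss())
ranks = allocate_ranks(scores, k=3)
print(ranks)  # {layer_index: lora_rank} for the 3 selected layers
```

In a real fine-tuning pipeline the resulting `ranks` dictionary would drive which layers receive adapters (e.g. via a per-layer rank configuration in a LoRA library), leaving the remaining layers entirely frozen and adapter-free.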
