RDP LoRA: Geometry-Driven Identification for Parameter-Efficient Adaptation in Large Language Models

arXiv cs.LG / April 22, 2026


Key Points

  • The paper argues that although parameter-efficient fine-tuning methods like LoRA reduce training cost, it remains unclear which specific layers should be adapted because the roles of internal representations are not well understood.
  • It models hidden-state evolution as a high-dimensional geometric trajectory and applies the Ramer-Douglas-Peucker (RDP) algorithm to find “breakpoints” that preserve major structural transitions while removing locally redundant changes.
  • The identified geometric pivots are used directly as a decision signal to select which layers to adapt, rather than only for post-hoc analysis.
  • When integrated into LoRA fine-tuning for Qwen3-8B-Base, adapting only 13 RDP-selected layers (81.67% on MMLU-Math) outperforms full 36-layer adaptation (79.32%), random 13-layer selection (75.56%), and the unadapted baseline (74.25%).
  • Overall, the work claims that intrinsic geometry of representation trajectories provides a robust, interpretable, and training-free method for improving layer selection in parameter-efficient adaptation.
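The breakpoint-selection idea above can be sketched with a generic implementation of the classic Ramer-Douglas-Peucker recursion, applied to a layer-wise trajectory of hidden-state vectors (one point per layer). The distance threshold `epsilon` and the idea of feeding in mean-pooled hidden states are illustrative assumptions here, not details taken from the paper:

```python
import numpy as np

def rdp_breakpoints(points, epsilon):
    """Return the indices of points retained by Ramer-Douglas-Peucker.

    `points` is a sequence of d-dimensional vectors, e.g. one (mean-pooled)
    hidden-state vector per transformer layer. The classic recursion keeps
    the point farthest from the chord between the current endpoints when
    its perpendicular distance exceeds `epsilon`, then recurses on both
    halves; retained indices are the geometric "breakpoints".
    """
    points = np.asarray(points, dtype=float)

    def perp_dist(p, a, b):
        # Perpendicular distance from p to the line through a and b;
        # the projection formula works in any dimension.
        ab = b - a
        norm = np.linalg.norm(ab)
        if norm == 0.0:
            return np.linalg.norm(p - a)
        proj = a + np.dot(p - a, ab) / norm**2 * ab
        return np.linalg.norm(p - proj)

    def recurse(lo, hi):
        if hi <= lo + 1:
            return {lo, hi}
        dists = [perp_dist(points[i], points[lo], points[hi])
                 for i in range(lo + 1, hi)]
        i_max = int(np.argmax(dists)) + lo + 1
        if dists[i_max - lo - 1] > epsilon:
            # Significant deviation: keep this pivot and split.
            return recurse(lo, i_max) | recurse(i_max, hi)
        # Everything between lo and hi is locally redundant.
        return {lo, hi}

    return sorted(recurse(0, len(points) - 1))
```

For a toy 2D trajectory with one kink, `rdp_breakpoints([[0, 0], [1, 0], [2, 0], [3, 3], [4, 6]], 0.5)` keeps only the endpoints and the kink, `[0, 2, 4]`; in the paper's setting the retained indices would instead be layer indices along the 36-layer trajectory.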

Abstract

Fine-tuning Large Language Models (LLMs) remains structurally uncertain despite parameter-efficient methods such as Low-Rank Adaptation (LoRA), as the layer-specific roles of internal representations are poorly understood, leading to heuristic decisions about where adaptation should be applied. We model the evolution of hidden states as a high-dimensional geometric trajectory and propose using the Ramer-Douglas-Peucker (RDP) algorithm, a parameter-free and training-free polygon simplification method that preserves global structural transitions while eliminating locally redundant changes, to identify critical breakpoints along the representation path. Crucially, we use these geometric pivots not merely for analysis, but as a direct decision signal for determining which layers should be adapted during parameter-efficient fine-tuning. By integrating this geometry-aware layer selection strategy into LoRA fine-tuning of Qwen3-8B-Base, we achieve superior performance on MMLU-Math using only 13 RDP-selected layers (81.67%), significantly outperforming both full 36-layer adaptation (79.32%) and random 13-layer selection (75.56%), as well as the baseline Qwen3-8B-Base model (74.25%). These results demonstrate that leveraging the intrinsic geometry of representation trajectories provides a robust, interpretable, and training-free signal for optimizing layer selection during model adaptation.
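One way the selected breakpoints could be wired into LoRA fine-tuning is via Hugging Face PEFT's `layers_to_transform` option, which restricts adapter placement to a list of layer indices. The sketch below is an illustrative configuration fragment, not the authors' exact setup: the specific layer indices, rank, alpha, and target modules are all assumptions.

```python
from peft import LoraConfig

# Hypothetical output of an RDP pass over the 36-layer hidden-state
# trajectory: 13 breakpoint layers (placeholder indices, not the paper's).
rdp_layers = [0, 2, 5, 8, 11, 14, 17, 20, 23, 26, 29, 32, 35]

config = LoraConfig(
    r=16,                                 # LoRA rank (assumed)
    lora_alpha=32,                        # scaling (assumed)
    target_modules=["q_proj", "v_proj"],  # attention projections (assumed)
    layers_to_transform=rdp_layers,       # adapt only the selected layers
    task_type="CAUSAL_LM",
)
# get_peft_model(base_model, config) would then attach LoRA adapters
# only at the RDP-selected depths, leaving the other 23 layers frozen.
```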