BoostLoRA: Growing Effective Rank by Boosting Adapters

arXiv cs.LG / 5/1/2026

📰 News · Models & Research

Key Points

  • BoostLoRA addresses a key limitation of ultra-low-parameter PEFT by enabling model expressivity to grow beyond a fixed low-rank subspace cap.
  • It uses an iterative gradient-boosting procedure: each round trains a minimal adapter only on the examples the current model mispredicts, then merges it into the base weights and discards it, leaving no inference overhead (a toy sketch follows this list).
  • A ROTATE SVD basis strategy assigns each training round its own orthogonal subspace, so the cumulative effective rank grows linearly with the number of rounds.
  • Experiments on Qwen2.5-3B show strong gains over TinyLoRA and full fine-tuning on GSM8K, MATH-500, MBPP, and HumanEval; on code generation, full fine-tuning falls below the zero-shot baseline.
  • The method also demonstrates cross-architecture transfer on protein binding classification using ESM2-650M, suggesting broader applicability of the training/merging strategy.
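To make the train-on-mistakes, merge, discard loop concrete, here is a minimal runnable sketch on a toy linear classifier. The synthetic task, the capacity-capped base weight, the LoRA-style factored update `B @ A`, and all hyperparameters are illustrative assumptions for the demo, not the paper's implementation.

```python
"""Toy sketch of a BoostLoRA-style boost-then-merge loop (illustrative only)."""
import torch
import torch.nn.functional as F

torch.manual_seed(0)
d, n, r, rounds = 32, 2048, 1, 5
X = torch.randn(n, d)
w_true = torch.randn(d)
y = (X @ w_true > 0).float()                      # synthetic binary labels

# Stand-in "base model": a deliberately capacity-capped weight that only
# captures the first 8 coordinates of the true solution.
W = torch.zeros(1, d)
W[0, :8] = w_true[:8]

for t in range(rounds):
    # 1. Collect the examples the current merged model mispredicts.
    with torch.no_grad():
        wrong = ((X @ W.T).squeeze(-1) > 0).float() != y
    if not wrong.any():
        break
    # 2. Train a fresh tiny adapter (LoRA-style factored update) on them only.
    B = torch.zeros(1, r, requires_grad=True)     # zero-init, standard LoRA style
    A = (0.01 * torch.randn(r, d)).requires_grad_()
    opt = torch.optim.Adam([B, A], lr=0.02)
    for _ in range(200):                          # arbitrary demo hyperparameters
        logits = (X[wrong] @ (W + B @ A).T).squeeze(-1)
        loss = F.binary_cross_entropy_with_logits(logits, y[wrong])
        opt.zero_grad(); loss.backward(); opt.step()
    # 3. Merge the adapter into the base weights and discard it, so the
    #    deployed model keeps exactly its original parameter count.
    with torch.no_grad():
        W = W + B @ A
    acc = (((X @ W.T).squeeze(-1) > 0).float() == y).float().mean()
    print(f"round {t}: accuracy after merge = {acc:.3f}")
```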

Abstract

Parameter-efficient fine-tuning (PEFT) methods face a tradeoff between adapter size and expressivity: ultra-low-parameter adapters are confined to fixed low-rank subspaces, capping performance even with extended training. We propose BoostLoRA, a gradient-boosting framework that overcomes this limit by iteratively training and merging minimal adapters on the examples the current model gets wrong. A ROTATE SVD basis strategy assigns each round to an orthogonal subspace, so the cumulative effective rank grows linearly with the number of rounds while each adapter remains ultra-low-rank. After merging, adapters are discarded, leaving zero inference overhead. On Qwen2.5-3B, BoostLoRA reaches 89.1% on GSM8K and 68.8% on MATH-500, surpassing both the best single-shot ultra-low-parameter adapter (TinyLoRA) and full fine-tuning; on code generation it reaches 57.2% on MBPP and 80.4% on HumanEval while full fine-tuning drops below the zero-shot baseline. We also demonstrate cross-architecture transfer on protein binding classification with ESM2-650M and cross-entropy training. BoostLoRA is, to our knowledge, the first PEFT method whose effective rank grows with training, separating per-round parameter cost from total representational capacity.
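The rank-growth claim is easy to verify numerically. The sketch below assumes (our reading of the summary, not the paper's released code) that round t's update is confined to a disjoint block of right-singular vectors of the base weight; because the row spaces of successive updates are mutually orthogonal, their ranks add, and the cumulative effective rank grows by r per round.

```python
"""Sketch of how per-round orthogonal SVD bases make effective rank grow
(our assumed construction, not the paper's code)."""
import torch

torch.manual_seed(0)
d_out, d_in, r, rounds = 64, 64, 2, 5
W = torch.randn(d_out, d_in)                  # stand-in base weight
_, _, Vh = torch.linalg.svd(W)                # orthonormal right-singular basis

cumulative = torch.zeros(d_out, d_in)         # sum of all merged updates
for t in range(rounds):
    basis = Vh[t * r:(t + 1) * r]             # round t's private orthogonal block
    B = 0.1 * torch.randn(d_out, r)           # trainable factor (random for the demo)
    cumulative += B @ basis                   # rank-r update in this round's subspace
    print(f"round {t}: cumulative effective rank = "
          f"{torch.linalg.matrix_rank(cumulative).item()}")
# Prints 2, 4, 6, 8, 10: orthogonal row spaces make the per-round ranks add.
```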