BoostLoRA: Growing Effective Rank by Boosting Adapters

arXiv cs.LG / 5/1/2026

📰 News · Models & Research

Key Points

  • BoostLoRA addresses a key limitation of ultra-low-parameter PEFT by enabling model expressivity to grow beyond a fixed low-rank subspace cap.
  • It uses an iterative gradient-boosting procedure: each round trains a minimal adapter only on the examples the current model mispredicts, then merges it into the base weights and discards it, leaving no inference overhead (a toy sketch follows this list).
  • A ROTATE SVD basis strategy assigns each training round its own orthogonal subspace, so the cumulative effective rank grows linearly with the number of rounds.
  • Experiments on Qwen2.5-3B show strong gains over TinyLoRA and full fine-tuning on GSM8K, MATH-500, MBPP, and HumanEval; on code generation, full fine-tuning falls below the zero-shot baseline.
  • The method also demonstrates cross-architecture transfer on protein binding classification using ESM2-650M, suggesting broader applicability of the training/merging strategy.
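To make the train-on-mistakes, merge, discard loop concrete, here is a minimal runnable sketch on a toy linear classifier. The synthetic task, the capacity-capped base weight, the LoRA-style factored update `B @ A`, and all hyperparameters are illustrative assumptions for the demo, not the paper's implementation.

```python
"""Toy sketch of a BoostLoRA-style boost-then-merge loop (illustrative only)."""
import torch
import torch.nn.functional as F

torch.manual_seed(0)
d, n, r, rounds = 32, 2048, 1, 5
X = torch.randn(n, d)
w_true = torch.randn(d)
y = (X @ w_true > 0).float()                      # synthetic binary labels

# Stand-in "base model": a deliberately capacity-capped weight that only
# captures the first 8 coordinates of the true solution.
W = torch.zeros(1, d)
W[0, :8] = w_true[:8]

for t in range(rounds):
    # 1. Collect the examples the current merged model mispredicts.
    with torch.no_grad():
        wrong = ((X @ W.T).squeeze(-1) > 0).float() != y
    if not wrong.any():
        break
    # 2. Train a fresh tiny adapter (LoRA-style factored update) on them only.
    B = torch.zeros(1, r, requires_grad=True)     # zero-init, standard LoRA style
    A = (0.01 * torch.randn(r, d)).requires_grad_()
    opt = torch.optim.Adam([B, A], lr=0.02)
    for _ in range(200):                          # arbitrary demo hyperparameters
        logits = (X[wrong] @ (W + B @ A).T).squeeze(-1)
        loss = F.binary_cross_entropy_with_logits(logits, y[wrong])
        opt.zero_grad(); loss.backward(); opt.step()
    # 3. Merge the adapter into the base weights and discard it, so the
    #    deployed model keeps exactly its original parameter count.
    with torch.no_grad():
        W = W + B @ A
    acc = (((X @ W.T).squeeze(-1) > 0).float() == y).float().mean()
    print(f"round {t}: accuracy after merge = {acc:.3f}")
```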

Abstract

Parameter-efficient fine-tuning (PEFT) methods face a tradeoff between adapter size and expressivity: ultra-low-parameter adapters are confined to fixed low-rank subspaces, capping performance even with extended training. We propose BoostLoRA, a gradient-boosting framework that overcomes this limit by iteratively training and merging minimal adapters on the examples the current model gets wrong. A ROTATE SVD basis strategy assigns each round to an orthogonal subspace, so the cumulative effective rank grows linearly with the number of rounds while each adapter remains ultra-low-rank. After merging, adapters are discarded, leaving zero inference overhead. On Qwen2.5-3B, BoostLoRA reaches 89.1% on GSM8K and 68.8% on MATH-500, surpassing both the best single-shot ultra-low-parameter adapter (TinyLoRA) and full fine-tuning; on code generation it reaches 57.2% on MBPP and 80.4% on HumanEval while full fine-tuning drops below the zero-shot baseline. We also demonstrate cross-architecture transfer on protein binding classification with ESM2-650M and cross-entropy training. BoostLoRA is, to our knowledge, the first PEFT method whose effective rank grows with training, separating per-round parameter cost from total representational capacity.
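The rank-growth claim is easy to verify numerically. The sketch below assumes (our reading of the summary, not the paper's released code) that round t's update is confined to a disjoint block of right-singular vectors of the base weight; because the row spaces of successive updates are mutually orthogonal, their ranks add, and the cumulative effective rank grows by r per round.

```python
"""Sketch of how per-round orthogonal SVD bases make effective rank grow
(our assumed construction, not the paper's code)."""
import torch

torch.manual_seed(0)
d_out, d_in, r, rounds = 64, 64, 2, 5
W = torch.randn(d_out, d_in)                  # stand-in base weight
_, _, Vh = torch.linalg.svd(W)                # orthonormal right-singular basis

cumulative = torch.zeros(d_out, d_in)         # sum of all merged updates
for t in range(rounds):
    basis = Vh[t * r:(t + 1) * r]             # round t's private orthogonal block
    B = 0.1 * torch.randn(d_out, r)           # trainable factor (random for the demo)
    cumulative += B @ basis                   # rank-r update in this round's subspace
    print(f"round {t}: cumulative effective rank = "
          f"{torch.linalg.matrix_rank(cumulative).item()}")
# Prints 2, 4, 6, 8, 10: orthogonal row spaces make the per-round ranks add.
```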