Scalable Variational Bayesian Fine-Tuning of LLMs via Orthogonalized Low-Rank Adapters

arXiv cs.LG / 4/7/2026


Key Points

  • The paper targets uncertainty quantification (UQ) for LLMs in safety-critical settings, focusing on the overconfidence that often arises after parameter-efficient fine-tuning with limited data.
  • It argues that existing calibration approaches—such as Laplace-based post-hoc methods and variational Bayesian training requiring Monte Carlo passes through the full backbone—are either suboptimal or not scalable for deployment.
  • To improve expressiveness and stabilize adaptation, it introduces PoLAR (Polar-decomposed Low-rank Adapter Representation), which orthogonalizes LoRA-style adapters and uses Riemannian optimization to mitigate rank collapse.
  • It then combines PoLAR with a Bayesian last-layer (BLL) and variational inference to form PoLAR-VBLL, using alternating optimization to jointly learn adapter parameters and an approximate posterior for uncertainty reasoning.
  • Experiments reportedly show improved generalization and better-calibrated uncertainty estimates on both in-distribution and out-of-distribution common-sense reasoning tasks.
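To make the PoLAR idea in the key points concrete, here is a minimal sketch of an orthogonalized low-rank adapter with a Riemannian update on the Stiefel manifold. All names, dimensions, and the learning rate are illustrative assumptions, not the paper's implementation; the point is only that a tangent-space projection followed by a QR retraction keeps the factor columns orthonormal, which is how such parameterizations avoid rank collapse.

```python
import numpy as np

rng = np.random.default_rng(0)
d, k, r = 16, 12, 4  # hypothetical dims: output, input, adapter rank

# Orthonormal factors U (d x r), V (k x r) via QR; small core S (r x r).
U, _ = np.linalg.qr(rng.normal(size=(d, r)))
V, _ = np.linalg.qr(rng.normal(size=(k, r)))
S = 0.01 * rng.normal(size=(r, r))

def delta_w(U, S, V):
    # Adapter update in a polar-like factored form: Delta W = U S V^T.
    return U @ S @ V.T

def riemannian_step(U, G, lr=0.1):
    # Project the Euclidean gradient G onto the tangent space of the
    # Stiefel manifold at U, take a gradient step, then retract with QR
    # so the columns stay exactly orthonormal after the update.
    G_tan = G - U @ ((U.T @ G + G.T @ U) / 2.0)
    Q, R = np.linalg.qr(U - lr * G_tan)
    return Q * np.sign(np.diag(R))  # sign fix for a unique QR factor

G = rng.normal(size=(d, r))  # stand-in for a gradient w.r.t. U
U_new = riemannian_step(U, G)
# U_new.T @ U_new is (numerically) the r x r identity: orthonormality
# is preserved through the update, unlike an unconstrained LoRA factor.
```

The same retraction would be applied to `V`; only the small core `S` is trained with an ordinary Euclidean optimizer.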

Abstract

When deploying large language models (LLMs) in safety-critical applications, uncertainty quantification (UQ) is of utmost importance for self-assessing the reliability of LLM-based decisions. However, such decisions typically suffer from overconfidence, particularly after parameter-efficient fine-tuning (PEFT) for downstream domain-specific tasks with limited data. Existing methods to alleviate this issue either rely on a Laplace-approximation-based post-hoc framework, which may yield suboptimal calibration depending on the training trajectory, or on variational Bayesian training that requires multiple complete forward passes through the entire LLM backbone at inference time for Monte Carlo estimation, posing scalability challenges for deployment. To address these limitations, we build on the Bayesian last layer (BLL) model, where the LLM-based deterministic feature extractor is followed by random last-layer parameters for uncertainty reasoning. Since existing low-rank adapters (LoRA) for PEFT have limited expressiveness due to rank collapse, we address this with the Polar-decomposed Low-rank Adapter Representation (PoLAR), an orthogonalized parameterization paired with Riemannian optimization that enables more stable and expressive adaptation. Building on this PoLAR-BLL model, we leverage the variational (V) inference framework to put forth a scalable Bayesian fine-tuning approach that jointly seeks the PoLAR parameters and the approximate posterior of the last-layer parameters via alternating optimization. The resulting PoLAR-VBLL is a flexible framework that integrates architecture-enhanced optimization with scalable Bayesian inference to endow LLMs with well-calibrated UQ. Our empirical results verify the effectiveness of PoLAR-VBLL in terms of generalization and uncertainty estimation on both in-distribution and out-of-distribution data across various common-sense reasoning tasks.
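The abstract's scalability claim rests on the Bayesian-last-layer structure: with a deterministic feature extractor, the posterior over the last-layer weights and the predictive variance can be computed in closed form, so no Monte Carlo forward passes through the backbone are needed at inference time. The sketch below illustrates this for the conjugate Gaussian (regression) case; the feature matrix, noise, and prior variances are invented stand-ins, and the paper's actual variational objective and alternating optimization are not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 50, 8  # hypothetical: n examples, p-dim backbone features

Phi = rng.normal(size=(n, p))  # stand-in for frozen LLM features
w_true = rng.normal(size=p)
y = Phi @ w_true + 0.1 * rng.normal(size=n)

def bll_posterior(Phi, y, noise_var=0.01, prior_var=1.0):
    # Conjugate Gaussian posterior over last-layer weights:
    #   Sigma = (Phi^T Phi / s2 + I / tau2)^{-1},  mu = Sigma Phi^T y / s2
    Sigma = np.linalg.inv(Phi.T @ Phi / noise_var
                          + np.eye(Phi.shape[1]) / prior_var)
    mu = Sigma @ Phi.T @ y / noise_var
    return mu, Sigma

def predict(phi_x, mu, Sigma, noise_var=0.01):
    # Predictive mean and variance from a single deterministic forward
    # pass -- the uncertainty comes from Sigma, not from sampling.
    mean = phi_x @ mu
    var = noise_var + phi_x @ Sigma @ phi_x
    return mean, var

mu, Sigma = bll_posterior(Phi, y)
mean, var = predict(Phi[0], mu, Sigma)
```

In the paper's setting the features themselves depend on the PoLAR adapter parameters, which is why an alternating scheme is needed: update the adapters with the posterior fixed, then refresh the last-layer posterior on the new features.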