Meta-Learning at Scale for Large Language Models via Low-Rank Amortized Bayesian Meta-Learning

arXiv stat.ML / 4/3/2026


Key Points

  • The paper proposes Amortized Bayesian Meta-Learning for LoRA (ABMLL) to fine-tune large language models across multiple datasets with low-rank adaptations while improving cross-dataset generalization in few-shot settings.
  • ABMLL reframes the roles of local vs. global variables within the LoRA parameterization and introduces a new hyperparameter that trades off reconstruction accuracy against the fidelity of task-specific parameters to the global ones.
  • Experiments on large models such as Llama3-8B and Qwen2-7B show ABMLL outperforming existing approaches on CrossFit and Unified-QA, improving both accuracy and expected calibration error.
  • The authors also demonstrate that meta-learning can be combined with in-context learning to further boost performance on the same benchmarks and on legal and chemistry application tasks.

Abstract

Fine-tuning large language models (LLMs) with low-rank adaptation (LoRA) is a cost-effective way to incorporate information from a specific dataset. However, when a problem requires incorporating information from multiple datasets, as in few-shot learning, generalization across datasets can be limited, driving up training costs. As a consequence, other approaches such as in-context learning are typically used in this setting. To address this challenge, we introduce an efficient method for adapting the weights of LLMs to multiple distributions, Amortized Bayesian Meta-Learning for LoRA (ABMLL). This method builds on amortized Bayesian meta-learning for smaller models, adapting the approach to LLMs by reframing where local and global variables are defined in LoRA and using a new hyperparameter to balance reconstruction accuracy against the fidelity of task-specific parameters to the global ones. ABMLL supports effective generalization across datasets and scales to large models such as Llama3-8B and Qwen2-7B, outperforming existing methods on the CrossFit and Unified-QA datasets in terms of both accuracy and expected calibration error. We show that meta-learning can also be combined with in-context learning, resulting in further improvements on both of these datasets as well as in legal and chemistry applications.
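The trade-off the abstract describes — reconstruction accuracy balanced against how closely each task's parameters stay to the shared global ones — has the shape of a weighted variational objective: a per-task data-fit term plus a KL penalty tying the task-specific posterior over LoRA parameters to a global prior. The sketch below illustrates that structure only; the function names, the Gaussian parameterization, and the weight `tau` are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def gaussian_kl(mu_q, logvar_q, mu_p, logvar_p):
    """KL( N(mu_q, e^logvar_q) || N(mu_p, e^logvar_p) ) for diagonal
    Gaussians, summed over all parameter dimensions."""
    return 0.5 * np.sum(
        logvar_p - logvar_q
        + (np.exp(logvar_q) + (mu_q - mu_p) ** 2) / np.exp(logvar_p)
        - 1.0
    )

def abmll_style_task_loss(recon_nll, mu_task, logvar_task,
                          mu_global, logvar_global, tau):
    """Illustrative per-task objective: reconstruction NLL plus a
    tau-weighted KL that keeps the task-specific LoRA posterior
    faithful to the global prior. Larger tau favors fidelity to the
    global parameters; smaller tau favors per-task fit."""
    kl = gaussian_kl(mu_task, logvar_task, mu_global, logvar_global)
    return recon_nll + tau * kl

# Toy check: when the task posterior equals the global prior, the
# KL penalty vanishes and only the reconstruction term remains.
mu_g = np.zeros(4)
lv_g = np.zeros(4)
loss = abmll_style_task_loss(1.25, mu_g, lv_g, mu_g, lv_g, tau=0.1)
print(loss)  # 1.25
```

In this toy picture, `tau` plays the role of the paper's new hyperparameter: it interpolates between fully independent per-task fine-tuning (`tau → 0`) and collapsing every task onto the shared global LoRA parameters (`tau → ∞`).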