Sparse Spectral LoRA: Routed Experts for Medical VLMs

arXiv cs.CV / 4/3/2026

Key Points

The paper introduces MedQwen, a parameter-efficient medical vision-language model designed to improve robustness across heterogeneous medical datasets and sequential clinical tasks.
It combines a spectrally routed Mixture-of-Experts (MoE) with a theoretically grounded scaling rule that aligns low-rank (LoRA-style) expert updates with a fully fine-tuned MoE while keeping the base architecture unchanged.
Experts are initialized using non-overlapping SVD segments of pretrained weights, along with residual compensation and scaling to promote stable specialization and more consistent routing under distribution shift.
Experiments on 23 medical datasets (covering VQA, report generation, classification, and hallucination mitigation) show strong reliability, including performance near full fine-tuning for zero-shot classification with 339× fewer trainable parameters.
The method substantially reduces catastrophic forgetting in sequential training, achieving about 5% forgetting versus >20–50% degradation for strong baselines.

Abstract

Large vision-language models (VLMs) excel on general benchmarks but often lack robustness in medical imaging, where heterogeneous supervision induces cross-dataset interference and sensitivity to data regime (i.e., how the supervisory signals are mixed). In realistic clinical workflows, data and tasks arrive sequentially, so naive continual training further leads to catastrophic forgetting. To address these challenges, we propose MedQwen, a parameter-efficient medical VLM that couples a spectrally routed Mixture-of-Experts (MoE) with a theoretically grounded scaling rule that aligns low-rank updates with a full-rank, fully fine-tuned MoE, without changing the base architecture. Concretely, we initialize each expert from non-overlapping singular value decomposition (SVD) segments of the pretrained weight and introduce a residual compensation and scaling scheme to enable stable expert specialization and consistent routing under distribution shift. Across 23 medical datasets covering visual question answering, report generation, radiology classification, and hallucination mitigation, MedQwen achieves strong, reliable performance: it approaches full fine-tuning on zero-shot classification with 339× fewer trainable parameters, and reduces sequential forgetting to ~5% where strong baselines degrade by >20-50%.
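
The expert initialization described in the abstract can be pictured with a small NumPy sketch. This is only one plausible reading, not the paper's implementation: the function name init_spectral_experts, the even split of each singular value between the two low-rank factors, and the choice to keep the leftover as a frozen residual matrix are all assumptions made for illustration. The abstract does not spell out the actual residual compensation and scaling rule, and it is not reproduced here.

```python
import numpy as np

def init_spectral_experts(W, num_experts, rank):
    """Hypothetical sketch: build LoRA-style experts from disjoint SVD segments of W.

    Expert e receives singular directions [e*rank, (e+1)*rank), so no two
    experts share a direction. Returns a list of (B, A) factor pairs and the
    residual matrix left after subtracting every expert's segment from W.
    """
    U, S, Vt = np.linalg.svd(W, full_matrices=False)
    if num_experts * rank > S.shape[0]:
        raise ValueError("not enough singular directions for the requested experts")

    experts = []
    for e in range(num_experts):
        seg = slice(e * rank, (e + 1) * rank)
        sqrt_s = np.sqrt(S[seg])
        # Assumption: split each singular value evenly between the two factors.
        B = U[:, seg] * sqrt_s          # shape (d_out, rank)
        A = sqrt_s[:, None] * Vt[seg]   # shape (rank, d_in)
        experts.append((B, A))

    # Assumption: the part of W not covered by any expert stays as a frozen
    # residual, so that residual + all expert products reconstructs W exactly.
    W_res = W - sum(B @ A for B, A in experts)
    return experts, W_res


# Toy usage with a random stand-in for a pretrained weight matrix.
rng = np.random.default_rng(0)
W = rng.standard_normal((8, 6))
experts, W_res = init_spectral_experts(W, num_experts=2, rank=2)
```

Because the segments come from orthogonal singular vectors, the experts start out spanning mutually orthogonal subspaces, which is one intuitive way to see why such an initialization could encourage distinct specialization. With sparse routing only some experts are active per input, so the layer output at initialization would differ from the pretrained one unless something compensates for the inactive segments; presumably that is the role of the paper's residual compensation and scaling scheme.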
