Auto-FlexSwitch: Efficient Dynamic Model Merging via Learnable Task Vector Compression
arXiv cs.LG / 5/1/2026
Key Points
- The paper proposes Auto-FlexSwitch, a dynamic model merging approach for multi-task adaptation that avoids the overhead of storing a separate parameter set per task.
- It builds on the observation that fine-tuned weight increments (“task vectors”) show an impulse-like activation pattern and remain robust under low-bit quantization.
- The method compresses task vectors by decomposing them into compact components (binary sparse mask, sign vector, and scalar scaling), enabling high-fidelity approximations at high compression ratios.
- It introduces training-free and learnable schemes (Auto-Switch and FlexSwitch) that assemble and compress task vectors adaptively using feature similarity retrieval, learnable gating sparsification, bit-width adaptive selection, and sparsity-aware storage.
- Auto-FlexSwitch further enhances dynamic merging by using a KNN inference procedure with a learnable low-rank metric to select and apply compressed task vectors at inference time.
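The mask/sign/scale decomposition described above can be sketched in a few lines. This is an illustrative reconstruction, not the paper's implementation: the magnitude-based top-k masking, the `keep_ratio` parameter, and the mean-magnitude scale are all assumptions chosen to show the shape of the idea (a binary mask, a 1-bit sign per kept entry, and a single scalar).

```python
import numpy as np

def compress_task_vector(tau, keep_ratio=0.1):
    """Decompose a task vector (fine-tuned weights minus base weights)
    into a binary sparse mask, a sign vector, and one scalar scale.
    Hypothetical sketch; keep_ratio and the top-k rule are illustrative."""
    flat = tau.ravel()
    k = max(1, int(keep_ratio * flat.size))
    # Keep only the k largest-magnitude entries (impulse-like activations).
    idx = np.argpartition(np.abs(flat), -k)[-k:]
    mask = np.zeros(flat.size, dtype=bool)
    mask[idx] = True
    sign = np.sign(flat[mask]).astype(np.int8)  # 1 bit per kept entry
    scale = np.abs(flat[mask]).mean()           # single scalar per vector
    return mask, sign, scale

def decompress_task_vector(mask, sign, scale, shape):
    """Rebuild a low-bit approximation of the task vector."""
    flat = np.zeros(mask.size, dtype=np.float32)
    flat[mask] = sign * scale
    return flat.reshape(shape)
```

Storage then amounts to one bit per parameter for the mask, one bit per kept entry for the sign, and a single float for the scale, which is where the high compression ratios come from.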
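The KNN inference step with a learnable low-rank metric can likewise be sketched. Assuming the metric is a distance computed after projecting features through a learned low-rank matrix `W` (an assumption; the paper's exact parameterization may differ), retrieval reduces to nearest-neighbor search in the projected space:

```python
import numpy as np

def low_rank_knn(query, keys, W, k=3):
    """Return indices of the k nearest stored task-vector keys under a
    learned low-rank metric d(q, x) = ||W q - W x||, where W is an
    (r x d) learnable projection with r << d. Illustrative sketch of
    the retrieval step only; training W is out of scope here."""
    q_proj = W @ query        # project query features: shape (r,)
    k_proj = keys @ W.T       # project all stored keys: shape (n, r)
    dists = np.linalg.norm(k_proj - q_proj, axis=1)
    return np.argsort(dists)[:k]
```

At inference, the returned indices would select which compressed task vectors to decompress and merge into the base model for the incoming input.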