Auto-FlexSwitch: Efficient Dynamic Model Merging via Learnable Task Vector Compression

arXiv cs.LG / 5/1/2026


Key Points

  • The paper proposes Auto-FlexSwitch, a dynamic model merging approach aimed at multi-task adaptation while avoiding the high storage overhead of storing separate parameters per task.
  • It builds on the observation that fine-tuned weight increments (“task vectors”) show an impulse-like activation pattern and remain robust under low-bit quantization.
  • The proposed T-Switch method compresses task vectors by decomposing each into three compact components (a binary sparse mask, a sign vector, and a scalar scaling factor), achieving high-fidelity approximations at high compression ratios.
  • Two complementary schemes extend this: Auto-Switch, a training-free scheme that assembles task vectors via feature similarity retrieval, and FlexSwitch, a learnable framework that adapts compression per model unit through learnable gating sparsification, bit-width adaptive selection, and sparsity-aware storage.
  • Auto-FlexSwitch further enhances dynamic merging by using a KNN inference procedure with a learnable low-rank metric to select and apply compressed task vectors at inference time.
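The mask/sign/scale decomposition in the key points above can be sketched in a few lines of NumPy. This is a minimal illustration, not the paper's exact procedure: the function names, the top-k magnitude rule for building the binary mask, and the mean-magnitude choice of scalar scale are all assumptions.

```python
import numpy as np

def compress_task_vector(tau, keep_ratio=0.1):
    """Approximate a task vector tau with (binary mask, sign vector, scalar scale).
    Hypothetical sketch of a T-Switch-style decomposition."""
    flat = tau.ravel()
    k = max(1, int(keep_ratio * flat.size))
    # Binary sparse mask: keep only the k largest-magnitude entries.
    idx = np.argpartition(np.abs(flat), -k)[-k:]
    mask = np.zeros(flat.size, dtype=bool)
    mask[idx] = True
    # 1-bit sign vector over the kept entries.
    sign = np.sign(flat[mask]).astype(np.int8)
    # Single scalar: mean magnitude of the kept entries.
    scale = float(np.abs(flat[mask]).mean())
    return mask, sign, scale

def decompress_task_vector(mask, sign, scale, shape):
    """Rebuild the dense approximation: scale * sign on the masked positions."""
    flat = np.zeros(mask.size, dtype=np.float32)
    flat[mask] = scale * sign
    return flat.reshape(shape)
```

Storage-wise, the mask costs one bit per weight, the sign one bit per kept weight, and the scale a single float, which is how such a decomposition reaches high compression ratios relative to storing full-precision increments.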

Abstract

Model merging has attracted attention as an effective path toward multi-task adaptation by integrating knowledge from multiple task-specific models. Among existing approaches, dynamic merging mitigates performance degradation caused by conflicting parameter updates across tasks by flexibly combining task-specific parameters at inference time, thereby maintaining high performance. However, these methods require storing independent parameters for each task, resulting in prohibitive storage overhead. To address this issue, we first experimentally demonstrate that the fine-tuned weight increments (referred to as task vectors) exhibit an impulse-like activation pattern and high robustness to low-bit representations. Driven by this insight, we propose T-Switch, which decomposes task vectors into three compact components: a binary sparse mask, a sign vector, and a scalar scaling factor, achieving high-fidelity approximation at high compression ratios. Building on this, we develop Auto-Switch, a training-free merging scheme that automatically assembles task vectors through feature similarity retrieval. Furthermore, to transform task vector sparsification and quantization from static rules to adaptive learning, we propose FlexSwitch, a learnable framework that jointly optimizes the compression strategy for each model unit via Learnable Gating Sparsification (LGS) and Bit-width Adaptive Selection (BAS), while employing the Sparsity-Aware Storage Strategy (SASS) to select the optimal storage encoding structure. Finally, by incorporating a K-Nearest Neighbor (KNN) inference scheme with a learnable low-rank metric, we present Auto-FlexSwitch, a dynamic model merging approach that supports highly efficient task vector compression.
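The KNN inference step with a learnable low-rank metric can be sketched as follows. This is an illustrative assumption of how such retrieval might look, not the paper's implementation: the prototype representation, the distance form d(x, y) = ||A(x - y)|| with a low-rank matrix A (which would be learned; here it is supplied), and the function name are all hypothetical.

```python
import numpy as np

def knn_select_tasks(query, task_prototypes, A, k=2):
    """Return indices of the k task prototypes nearest to a query feature
    under a low-rank metric d(x, y) = ||A (x - y)||, with A of shape (r, d), r << d."""
    diffs = task_prototypes - query       # (T, d): offset to each task prototype
    proj = diffs @ A.T                    # (T, r): project into the learned low-rank space
    dists = np.linalg.norm(proj, axis=1)  # distance to each task under the metric
    return np.argsort(dists)[:k]          # k nearest task indices
```

At inference time, the selected indices would determine which compressed task vectors to decompress and merge into the base model for the current input.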