Aggregation Alignment for Federated Learning with Mixture-of-Experts under Data Heterogeneity
arXiv cs.LG / 3/24/2026
💬 Opinion · Ideas & Deep Analysis · Models & Research
Key Points
- The paper studies how to fine-tune mixture-of-experts (MoE) LLMs with federated learning (FL) on non-IID, privacy-sensitive data, where client heterogeneity breaks standard parameter aggregation.
- It identifies two key FL-specific aggregation failures: inconsistent gating preferences that yield a “one-size-fits-none” global router, and mismatched semantic roles of same-index experts that blur specialization.
- To address these failures, it proposes FedAlign-MoE, which aligns routing behavior across clients via routing-distribution consistency weighting and regularization, yielding more stable global gating (one possible reading is sketched after this list).
- It also introduces a semantic consistency measure for same-index experts and selectively aggregates each expert's updates only from semantically aligned clients, preserving expert specialization (see the second sketch after this list).
- Experiments reportedly show FedAlign-MoE improves over prior methods with faster convergence and higher accuracy in non-IID federated settings.
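
The summary does not spell out FedAlign-MoE's exact formulation, so the following is a minimal sketch of one plausible reading of "routing-distribution consistency weighting": each client reports its average gate probability per expert, the server scores how close each client's routing distribution is to the cross-client mean, and router updates are averaged with those scores as weights. The function names, the KL-based score, and the `temperature` parameter are assumptions for illustration, not the paper's stated method.

```python
import numpy as np

def routing_consistency_weights(client_gate_dists, temperature=1.0):
    """Weight each client's router update by how consistent its empirical
    expert-usage distribution is with the cross-client average.

    client_gate_dists: (num_clients, num_experts) array; each row is a
    client's average gate probability per expert over its local data.
    Returns one normalized aggregation weight per client.
    """
    dists = np.asarray(client_gate_dists, dtype=np.float64)
    dists = dists / dists.sum(axis=1, keepdims=True)   # ensure rows are distributions
    mean_dist = dists.mean(axis=0)                     # reference routing distribution

    # Consistency score: clients whose routing diverges more from the mean
    # (larger KL) receive exponentially smaller aggregation weight.
    eps = 1e-12
    kl = np.sum(dists * np.log((dists + eps) / (mean_dist + eps)), axis=1)
    scores = np.exp(-kl / temperature)

    return scores / scores.sum()

def aggregate_router(client_router_params, weights):
    """Weighted average of per-client router (gating) parameter tensors."""
    stacked = np.stack(client_router_params, axis=0)   # (num_clients, ...)
    w = np.asarray(weights).reshape(-1, *([1] * (stacked.ndim - 1)))
    return (w * stacked).sum(axis=0)
```

On the client side, the "regularization" half of the claim could plausibly be a penalty during local fine-tuning that pulls each local router's expert-usage distribution toward the current global one, but the summary does not confirm this.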
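The summary also does not say how semantic consistency between same-index experts is measured. One simple proxy, sketched below under that assumption, is to compare each expert's behavior on a shared probe batch and aggregate a given expert index only across clients whose experts behave similarly. The signature definition, the median/cosine comparison, and `threshold=0.8` are all illustrative choices, not details from the paper.

```python
import numpy as np

def expert_semantic_signature(expert_fn, probe_batch):
    """Signature for one expert: its mean output on a shared probe batch.
    expert_fn maps (batch, d_model) -> (batch, d_model)."""
    return expert_fn(probe_batch).mean(axis=0)

def select_aligned_clients(signatures, threshold=0.8):
    """For one expert index, keep only clients whose signature is
    cosine-similar to the element-wise median signature.

    signatures: (num_clients, d_model) array of per-client signatures
    for the SAME expert index. Returns indices of aligned clients.
    """
    sigs = np.asarray(signatures, dtype=np.float64)
    ref = np.median(sigs, axis=0)                       # robust reference signature
    ref = ref / (np.linalg.norm(ref) + 1e-12)
    norms = np.linalg.norm(sigs, axis=1, keepdims=True) + 1e-12
    cos = (sigs / norms) @ ref                          # cosine similarity per client
    return np.where(cos >= threshold)[0]

def aggregate_expert(client_expert_params, aligned_idx):
    """Average this expert's parameters over semantically aligned clients only;
    fall back to averaging all clients if none pass the threshold."""
    params = np.stack(client_expert_params, axis=0)
    if len(aligned_idx) == 0:
        return params.mean(axis=0)
    return params[aligned_idx].mean(axis=0)
```

The median-plus-cosine rule is just one robust way to operationalize "semantically aligned"; the paper may instead use activation statistics, parameter-space similarity, or another criterion.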