FedSQ: Optimized Weight Averaging via Fixed Gating
arXiv cs.LG / 4/6/2026
Key Points
- FedSQ is a federated learning approach designed to address instability in naive weight averaging caused by client data heterogeneity (non-i.i.d. splits) during federated fine-tuning.
- The method builds on the observation that ReLU-like (piecewise-linear) activations induce gating regimes, and it separates the model's structure ("structural knowledge", the on/off gating pattern) from the remaining parameters ("quantitative knowledge").
- FedSQ uses a DualCopy setup where a frozen copy of a pretrained backbone induces fixed binary gating masks, while only a quantitative copy is trained locally and aggregated across federated rounds.
- By fixing the gating masks, FedSQ restricts learning to within-regime affine refinements, improving the stability of aggregation under heterogeneous client partitions.
- Experiments on two CNN backbones, across i.i.d. and Dirichlet data splits, show improved robustness and, in some settings, fewer rounds to reach best validation performance, while maintaining accuracy when training is initialized from a pretrained (transfer) model.
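The DualCopy idea can be illustrated with a minimal sketch. Everything below is an assumption for illustration (a single hidden layer, NumPy in place of a deep-learning framework, the names `fixed_gate_forward` and `fedavg`); the paper's actual architecture and training loop will differ. The point shown: the frozen copy alone decides the binary gating mask, so every client, and their average, shares the same activation pattern for a given input, and learning is confined to affine refinements within each fixed linear region.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes, not from the paper.
D_IN, D_HID = 4, 8

# "Structural" copy: frozen pretrained weights that only supply gating masks.
W_frozen = rng.normal(size=(D_IN, D_HID))

def fixed_gate_forward(x, W_train):
    """Forward pass where the ReLU on/off pattern comes from the frozen copy.

    The trainable ("quantitative") copy is thereby restricted to affine
    refinements inside each fixed linear region.
    """
    mask = (x @ W_frozen) > 0          # fixed binary gating mask
    return (x @ W_train) * mask        # gate the trainable pre-activations

def fedavg(client_weights):
    """Plain federated averaging, applied to the trainable copies only."""
    return np.mean(client_weights, axis=0)

# Two clients start from the pretrained weights and drift locally
# (here simulated with noise in place of local gradient steps).
clients = [W_frozen + 0.1 * rng.normal(size=W_frozen.shape) for _ in range(2)]
W_agg = fedavg(clients)

# Any input sees the same gating pattern under every client and under W_agg,
# because the mask depends only on W_frozen.
x = rng.normal(size=(3, D_IN))
y = fixed_gate_forward(x, W_agg)
```

With naive FedAvg, clients whose local updates flip ReLU activation patterns can average into a model that lies in none of their linear regions; pinning the mask to the frozen copy removes that source of aggregation instability.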