CO-EVO: Co-evolving Semantic Anchoring and Style Diversification for Federated DG-ReID

arXiv cs.LG / 4/30/2026


Key Points

  • The paper introduces CO-EVO, a federated domain generalization framework for person re-identification (FedDG-ReID) that preserves raw data privacy while improving generalization to unseen target environments.
  • On the semantic side, it addresses stylistic gaps across decentralized clients with Camera-Invariant Semantic Anchoring (CSA), a semantic-purification module that learns identity prompts with cross-camera consistency.
  • On the visual side, it proposes Global Style Diversification (GSD) using a Global Camera-Style Bank (GCSB) to synthesize realistic perturbations that broaden the training style range.
  • CO-EVO’s co-evolutionary loop uses purified semantic anchors to steer the image encoder toward robust anatomical (identity-related) attributes despite diverse style variations.
  • Experiments reportedly achieve state-of-the-art performance, and the authors release code for replication and further research.
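The Global Camera-Style Bank described above can be illustrated with an AdaIN-style sketch: per-camera channel statistics are collected into a shared bank, and a feature map is "restylized" by swapping its own statistics for those of another camera. This is a minimal illustration under assumed internals (the class name, statistic choice, and perturbation rule are not specified in this summary, and the paper's actual mechanism may differ):

```python
import numpy as np

class GlobalCameraStyleBank:
    """Hypothetical sketch of a Global Camera-Style Bank (GCSB):
    stores channel-wise feature statistics (mean, std) per camera.
    The paper's exact internals are not given in this summary."""

    def __init__(self):
        self.stats = {}  # camera_id -> (mean[C], std[C])

    def update(self, camera_id, feat):
        # feat: [N, C, H, W] -- record channel statistics for this camera
        mean = feat.mean(axis=(0, 2, 3))
        std = feat.std(axis=(0, 2, 3)) + 1e-6
        self.stats[camera_id] = (mean, std)

    def perturb(self, feat, target_camera):
        """AdaIN-style restylization: strip the input's own per-sample
        style statistics, then re-inject those of another camera."""
        mean, std = self.stats[target_camera]
        mu = feat.mean(axis=(2, 3), keepdims=True)
        sigma = feat.std(axis=(2, 3), keepdims=True) + 1e-6
        normalized = (feat - mu) / sigma
        return normalized * std.reshape(1, -1, 1, 1) + mean.reshape(1, -1, 1, 1)
```

Perturbing a batch with statistics drawn from a different camera broadens the style range seen during training without ever exchanging raw images, which is what makes the idea compatible with the federated privacy constraint.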

Abstract

Federated domain generalization for person re-identification (FedDG-ReID) aims to collaboratively train a pedestrian retrieval model across multiple decentralized source domains such that it can generalize to unseen target environments without compromising raw data privacy. However, this task is significantly challenged by the inherent stylistic gaps across decentralized clients. Without global supervision, models easily succumb to shortcut learning, where representations overfit to domain-specific camera biases rather than universal identity features. We propose CO-EVO, a novel federated framework that resolves this semantic-style conflict through a co-evolutionary mechanism. On the semantic side, Camera-Invariant Semantic Anchoring (CSA) learns identity prompts with cross-camera consistency to establish purified and domain-agnostic anchors that filter out local imaging noise. On the visual side, Global Style Diversification (GSD), powered by a Global Camera-Style Bank (GCSB), synthesizes realistic perturbations to expand the visual boundaries of the training data. The core of CO-EVO is its co-evolutionary loop, where purified anchors act as gravitational centers that guide the image encoder toward robust anatomical attributes amidst diverse style variations. Extensive experiments demonstrate that CO-EVO achieves state-of-the-art (SOTA) performance, proving that the synergy between semantic purification and style expansion is essential for robust cross-domain generalization. Our code is available at: https://github.com/NanYiyuzurn/ACL-LGPS-2026.
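The "gravitational center" role of the purified anchors can be pictured as a CLIP-style alignment objective: each identity's anchor acts as a fixed class center, and (possibly style-perturbed) image features are pulled toward the anchor of their identity via a contrastive cross-entropy over all anchors. The function name, temperature value, and exact loss form below are illustrative assumptions, not the paper's stated formulation:

```python
import numpy as np

def anchor_alignment_loss(img_feats, anchors, labels, temperature=0.07):
    """Hypothetical sketch: pull image embeddings toward the purified
    semantic anchor of their identity, against all other anchors.

    img_feats: [N, D] image embeddings (possibly style-perturbed)
    anchors:   [K, D] one purified semantic anchor per identity
    labels:    [N] identity index of each image
    """
    # cosine similarity between each image and every anchor
    f = img_feats / np.linalg.norm(img_feats, axis=1, keepdims=True)
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    logits = f @ a.T / temperature                   # [N, K]
    logits -= logits.max(axis=1, keepdims=True)      # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()
```

Minimizing such a loss over style-diversified views is one way the semantic and visual sides could co-evolve: the anchors stay stable while the encoder learns to land near them regardless of camera style.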