AI Navigate

Text-Phase Synergy Network with Dual Priors for Unsupervised Cross-Domain Image Retrieval

arXiv cs.CV / 3/16/2026

📰 News · Models & Research

Key Points

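  • The paper tackles unsupervised cross-domain image retrieval and identifies two limitations of existing methods: discrete pseudo-labels give inaccurate and incomplete semantic guidance, and feature alignment overlooks the entanglement between domain-specific and semantic information.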
  • It proposes TPSNet, which uses CLIP-generated domain prompts as a text prior to provide more precise semantic supervision across domains.
  • It introduces a domain-invariant phase feature as a phase prior that bridges domain distribution gaps while preserving semantic integrity.
  • The combination of text priors and phase priors yields significant improvements over state-of-the-art methods on unsupervised cross-domain image retrieval benchmarks.
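As a rough illustration of how a text prior can give softer supervision than a discrete pseudo-label, the sketch below scores images against per-domain class prompts in the usual CLIP style (softmax over cosine similarities). It is not TPSNet's implementation. The prompt template, the `temperature` value, the function names, and the random vectors standing in for CLIP encoder outputs are all assumptions made for this example.

```python
import numpy as np

def build_domain_prompts(domain, class_names):
    # Hypothetical template; the wording the paper uses is not stated in this summary.
    return [f"a {domain} of a {name}" for name in class_names]

def soft_labels(image_emb, text_emb, temperature=0.01):
    """Softmax over cosine similarities between image and prompt embeddings."""
    image_emb = image_emb / np.linalg.norm(image_emb, axis=1, keepdims=True)
    text_emb = text_emb / np.linalg.norm(text_emb, axis=1, keepdims=True)
    logits = image_emb @ text_emb.T / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    probs = np.exp(logits)
    return probs / probs.sum(axis=1, keepdims=True)

prompts = build_domain_prompts("sketch", ["dog", "car", "tree"])

# Random vectors stand in for CLIP text and image embeddings (assumption).
rng = np.random.default_rng(0)
text_emb = rng.normal(size=(len(prompts), 16))
image_emb = rng.normal(size=(5, 16))

probs = soft_labels(image_emb, text_emb)  # one distribution over classes per image
hard = probs.argmax(axis=1)               # what a discrete pseudo-label would keep
```

Each row of `probs` retains the relative similarity to every class prompt, whereas `hard` keeps only the top class, which is the kind of information loss the key points attribute to discrete pseudo-labels.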

Abstract

This paper studies unsupervised cross-domain image retrieval (UCDIR), which aims to retrieve images of the same category across different domains without relying on labeled data. Existing methods typically utilize pseudo-labels, derived from clustering algorithms, as supervisory signals for intra-domain representation learning and cross-domain feature alignment. However, these discrete pseudo-labels often fail to provide accurate and comprehensive semantic guidance. Moreover, the alignment process frequently overlooks the entanglement between domain-specific and semantic information, leading to semantic degradation in the learned representations and ultimately impairing retrieval performance. This paper addresses these limitations by proposing a Text-Phase Synergy Network with Dual Priors (TPSNet). Specifically, we first employ CLIP to generate a set of class-specific prompts per domain, termed domain prompts, which serve as a text prior that offers more precise semantic supervision. In parallel, we introduce a phase prior, represented by domain-invariant phase features, which is integrated into the original image representations to bridge the domain distribution gaps while preserving semantic integrity. Leveraging the synergy of these dual priors, TPSNet significantly outperforms state-of-the-art methods on UCDIR benchmarks.
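The abstract does not say how the phase features are computed. A minimal sketch, assuming "phase" refers to the Fourier phase spectrum of an image, shows why phase is a plausible domain-invariant signal: a global contrast change (used here as a crude stand-in for a domain shift) rescales the amplitude spectrum but leaves the phase untouched. The function name and the toy 8×8 image are illustrative only and are not taken from the paper.

```python
import numpy as np

def split_amplitude_phase(image):
    """Return the amplitude and phase spectra of a 2-D grayscale image."""
    spectrum = np.fft.fft2(image)
    return np.abs(spectrum), np.angle(spectrum)

rng = np.random.default_rng(0)
image = rng.random((8, 8))   # toy grayscale image
shifted = 2.5 * image        # crude "domain shift": global contrast change

amp_a, phase_a = split_amplitude_phase(image)
amp_b, phase_b = split_amplitude_phase(shifted)

# Compare phases as unit phasors so that +pi and -pi count as the same angle.
same_phase = np.allclose(np.exp(1j * phase_a), np.exp(1j * phase_b))
same_amplitude = np.allclose(amp_a, amp_b)

# Amplitude and phase together still recover the original image.
restored = np.real(np.fft.ifft2(amp_a * np.exp(1j * phase_a)))
```

Here `same_phase` is true while `same_amplitude` is false: the appearance change lives entirely in the amplitude. Real domain gaps (photo versus sketch, for example) are far less clean than a contrast change, so this only motivates the idea and says nothing about how TPSNet extracts or fuses its phase features.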