See Through the Noise: Improving Domain Generalization in Gaze Estimation

arXiv cs.CV / 4/21/2026

Key Points

  • The paper studies how label noise from difficult-to-obtain gaze annotations can harm domain generalization in gaze estimation models.
  • It proposes the See-Through-Noise (SeeTN) framework to mitigate label noise by building a prototype-based semantic embedding space that preserves topology between gaze features and continuous labels.
  • SeeTN uses feature–label affinity consistency to separate noisy samples from clean ones and applies affinity regularization on the semantic manifold to transfer gaze information from clean to noisy data.
  • Experiments show SeeTN improves cross-domain generalization under source-domain label noise while maintaining source-domain accuracy, emphasizing that noise should be explicitly handled in generalized gaze estimation.
  • Overall, the work connects noise robustness with domain-invariant gaze relationships enforced through semantic structure alignment.
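
The affinity-consistency idea in the third point can be illustrated with a toy sketch. This is not the paper's implementation: the Gaussian kernel, the bandwidths, the per-sample cosine score, the synthetic data, and the function names `affinity` and `consistency_scores` are all assumptions chosen for illustration. The intuition is that a sample whose neighbors in feature space differ from its neighbors in label space is likely to carry a noisy label.

```python
import numpy as np

def affinity(x, bandwidth):
    """Row-normalized Gaussian affinity between all pairs of rows in x."""
    sq = np.sum((x[:, None, :] - x[None, :, :]) ** 2, axis=-1)
    a = np.exp(-sq / (2.0 * bandwidth ** 2))
    np.fill_diagonal(a, 0.0)  # ignore self-affinity
    return a / a.sum(axis=1, keepdims=True)

def consistency_scores(features, labels, feat_bw=0.1, label_bw=0.1):
    """Cosine similarity between each sample's feature-affinity row
    and its label-affinity row (higher means more consistent)."""
    af = affinity(features, feat_bw)
    al = affinity(labels, label_bw)
    num = np.sum(af * al, axis=1)
    den = np.linalg.norm(af, axis=1) * np.linalg.norm(al, axis=1)
    return num / den

# Synthetic data (hypothetical): 2-D gaze angles and features that
# depend smoothly on the true gaze.
rng = np.random.default_rng(0)
true_gaze = rng.uniform(-0.5, 0.5, size=(200, 2))
features = np.hstack([true_gaze, true_gaze ** 2])
features = features + rng.normal(0.0, 0.01, size=features.shape)

# Corrupt the labels of the first 20 samples.
labels = true_gaze.copy()
noisy_idx = np.arange(20)
labels[noisy_idx] += rng.normal(0.0, 0.3, size=(20, 2))

scores = consistency_scores(features, labels)
print("mean score, noisy:", scores[noisy_idx].mean())
print("mean score, clean:", scores[20:].mean())
```

In this toy setting, thresholding `scores` would give a crude clean/noisy split; how SeeTN actually scores and separates samples is described in the paper itself.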

Abstract

Generalizable gaze estimation methods have garnered increasing attention due to their critical importance in real-world applications and have achieved significant progress. However, they often overlook the effect of label noise, arising from the inherent difficulty of acquiring precise gaze annotations, on model generalization performance. In this paper, we are the first to comprehensively investigate the negative effects of label noise on generalization in gaze estimation. Further, we propose a novel solution, the See-Through-Noise (SeeTN) framework, which improves generalization from a novel perspective of mitigating label noise. Specifically, we propose to construct a semantic embedding space via a prototype-based transformation to preserve a consistent topological structure between gaze features and continuous labels. We then measure feature-label affinity consistency to distinguish noisy from clean samples, and introduce a novel affinity regularization in the semantic manifold to transfer gaze-related information from clean to noisy samples. SeeTN promotes semantic structure alignment and enforces domain-invariant gaze relationships, thereby enhancing robustness against label noise. Extensive experiments demonstrate that SeeTN effectively mitigates the adverse impact of source-domain noise, leading to superior cross-domain generalization without compromising source-domain accuracy, and highlight the importance of explicitly handling noise in generalized gaze estimation.
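
The abstract's step of transferring gaze-related information from clean to noisy samples can be loosely illustrated with a stand-in technique. The sketch below is not the paper's affinity regularization: it uses plain kernel-weighted averaging (Nadaraya-Watson style) of clean labels in feature space, assumes the clean/noisy split is already known, and the function name `propagate_from_clean`, the bandwidth, and the synthetic data are all hypothetical.

```python
import numpy as np

def propagate_from_clean(features, labels, is_clean, bandwidth=0.1):
    """Replace each noisy sample's label with a kernel-weighted
    average of clean labels, weighted by feature-space proximity."""
    clean_f, clean_y = features[is_clean], labels[is_clean]
    noisy_f = features[~is_clean]
    sq = np.sum((noisy_f[:, None, :] - clean_f[None, :, :]) ** 2, axis=-1)
    # Subtract the row minimum before exponentiating for numerical stability.
    sq = sq - sq.min(axis=1, keepdims=True)
    w = np.exp(-sq / (2.0 * bandwidth ** 2))
    w = w / w.sum(axis=1, keepdims=True)
    refined = labels.copy()
    refined[~is_clean] = w @ clean_y
    return refined

# Synthetic data (hypothetical): features depend smoothly on true gaze.
rng = np.random.default_rng(1)
true_gaze = rng.uniform(-0.5, 0.5, size=(200, 2))
features = np.hstack([true_gaze, true_gaze ** 2])
features = features + rng.normal(0.0, 0.01, size=features.shape)

# Corrupt the first 20 labels and mark them as noisy (oracle split).
labels = true_gaze.copy()
labels[:20] += rng.normal(0.0, 0.3, size=(20, 2))
is_clean = np.ones(200, dtype=bool)
is_clean[:20] = False

refined = propagate_from_clean(features, labels, is_clean)
err_before = np.linalg.norm(labels[:20] - true_gaze[:20], axis=1).mean()
err_after = np.linalg.norm(refined[:20] - true_gaze[:20], axis=1).mean()
print("mean label error before:", err_before)
print("mean label error after: ", err_after)
```

SeeTN, as described, applies a regularizer in a learned semantic manifold during training rather than overwriting labels; the sketch only shows why clean neighbors in feature space carry usable information about a noisy sample's gaze.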