End-to-End Shared Attention Estimation via Group Detection with Feedback Refinement

arXiv cs.CV / 4/3/2026


Key Points

  • The paper introduces an end-to-end method that jointly detects the focusing group and estimates shared attention (SA) in images, addressing limitations of prior SA approaches that assume a single SA point or ignore real group membership.
  • It uses a two-step pipeline: first generating SA heatmaps from individual gaze attention heatmaps plus estimated group membership scalars, and then refining group memberships based on the initial SA heatmaps.
  • The refinement step is designed to improve consistency between detected group membership and the resulting SA heatmap, producing a more accurate final SA prediction.
  • Experiments report that the proposed approach outperforms existing methods on both group detection and SA estimation, and ablation/analysis supports the contribution of each component.
  • The authors provide code via the linked GitHub repository, enabling reproduction and further experimentation.

Abstract

This paper proposes an end-to-end shared attention estimation method via group detection. Most previous methods estimate shared attention (SA) without detecting the actual group of people focusing on it, or assume that there is a single SA point in a given image. These issues limit the applicability of SA detection in practice and impact performance. To address them, we propose to simultaneously achieve group detection and shared attention estimation using a two-step process: (i) the generation of SA heatmaps relying on individual gaze attention heatmaps and group membership scalars estimated in a group inference step; (ii) a refinement of the initial group memberships that accounts for the initial SA heatmaps, followed by the final prediction of the SA heatmap. Experiments demonstrate that our method outperforms other methods in group detection and shared attention estimation. Additional analyses validate the effectiveness of the proposed components. Code: https://github.com/chihina/sagd-CVPRW2026.
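To make the two-step pipeline concrete, here is a minimal NumPy sketch of the feedback loop it describes: step (i) fuses per-person gaze heatmaps into an SA heatmap weighted by membership scalars, and step (ii) re-scores each person's membership by how well their gaze overlaps that SA heatmap. This is an illustrative toy, not the paper's implementation — the actual method learns these steps end-to-end with neural networks, and the fusion and refinement functions below are assumptions for illustration only.

```python
import numpy as np

def estimate_shared_attention(gaze_heatmaps, memberships):
    """Step (i): fuse individual gaze heatmaps into one SA heatmap,
    weighting each person's map by their group-membership scalar.
    (Toy stand-in for the paper's learned SA generation.)"""
    w = np.asarray(memberships)[:, None, None]   # (P, 1, 1)
    sa = (w * gaze_heatmaps).sum(axis=0)         # (H, W)
    return sa / max(sa.max(), 1e-8)              # normalize to [0, 1]

def refine_memberships(gaze_heatmaps, sa_heatmap):
    """Step (ii): re-score membership as the overlap between each
    person's gaze heatmap and the current SA heatmap.
    (Toy stand-in for the paper's learned refinement.)"""
    overlap = (gaze_heatmaps * sa_heatmap).sum(axis=(1, 2))
    norm = gaze_heatmaps.sum(axis=(1, 2)) + 1e-8
    return overlap / norm                        # (P,) in [0, 1]

# Toy scene: two people fixate the same point, one looks elsewhere.
P, H, W = 3, 8, 8
heatmaps = np.zeros((P, H, W))
heatmaps[0, 2, 2] = heatmaps[1, 2, 2] = 1.0     # shared gaze target
heatmaps[2, 6, 6] = 1.0                         # outlier gaze
m0 = np.ones(P)                                 # uniform initial memberships

sa = estimate_shared_attention(heatmaps, m0)    # initial SA heatmap
m1 = refine_memberships(heatmaps, sa)           # outlier is down-weighted
sa_final = estimate_shared_attention(heatmaps, m1)
```

After one refinement pass, the outlier's membership drops relative to the two in-group gazers, so the final SA heatmap concentrates on the shared target — the consistency effect the refinement step is designed to produce.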