End-to-End Shared Attention Estimation via Group Detection with Feedback Refinement
arXiv cs.CV / 4/3/2026
Key Points
- The paper introduces an end-to-end method that jointly detects the group of people focusing on a common target and estimates shared attention (SA) in images. It addresses two limitations of prior SA approaches: assuming a single SA point and ignoring actual group membership.
- It uses a two-step pipeline: it first generates SA heatmaps from individual gaze attention heatmaps and estimated group membership scalars, then refines the group memberships based on those initial SA heatmaps.
- The refinement step is designed to improve consistency between detected group membership and the resulting SA heatmap, producing a more accurate final SA prediction.
- Experiments report that the proposed approach outperforms existing methods on both group detection and SA estimation, and ablation studies support the contribution of each component.
- The authors provide code via the linked GitHub repository, enabling reproduction and further experimentation.
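
The two-step idea in the key points (combine per-person gaze heatmaps using membership scalars, then refine memberships against the resulting SA heatmap) can be illustrated with a toy NumPy sketch. This is not the authors' model: the weighted average, the cosine-similarity refinement, the 0.5 threshold, and all function names (`shared_attention_heatmap`, `refine_membership`, `estimate`, `gaussian_blob`) are assumptions chosen only to make the feedback loop concrete. The paper's method is learned end to end.

```python
import numpy as np

def shared_attention_heatmap(gaze_maps, membership):
    """Membership-weighted average of per-person gaze heatmaps.

    gaze_maps: array of shape (N, H, W); membership: array of shape (N,).
    """
    weights = membership / (membership.sum() + 1e-8)
    return np.tensordot(weights, gaze_maps, axes=1)  # shape (H, W)

def refine_membership(gaze_maps, sa_map):
    """Re-score each person by cosine similarity between their gaze
    heatmap and the current SA heatmap (an illustrative stand-in)."""
    flat = gaze_maps.reshape(len(gaze_maps), -1)
    sa = sa_map.ravel()
    norms = np.linalg.norm(flat, axis=1) * np.linalg.norm(sa) + 1e-8
    return (flat @ sa) / norms

def estimate(gaze_maps, init_membership, threshold=0.5):
    """Step 1: initial SA heatmap. Step 2: refine memberships from it,
    then recompute the SA heatmap from the refined group."""
    sa_initial = shared_attention_heatmap(gaze_maps, init_membership)
    refined = refine_membership(gaze_maps, sa_initial)
    in_group = (refined >= threshold).astype(float)
    sa_final = shared_attention_heatmap(gaze_maps, in_group)
    return refined, sa_final

def gaussian_blob(h, w, cy, cx, sigma=2.0):
    """Synthetic gaze heatmap: a Gaussian centred at (cy, cx)."""
    ys, xs = np.mgrid[0:h, 0:w]
    return np.exp(-((ys - cy) ** 2 + (xs - cx) ** 2) / (2 * sigma ** 2))

# Hypothetical scene: three people look at (8, 8), one looks at (24, 24).
maps = np.stack([gaussian_blob(32, 32, 8, 8)] * 3 + [gaussian_blob(32, 32, 24, 24)])
refined, sa_final = estimate(maps, np.ones(4))
print("refined membership scores:", np.round(refined, 3))
print("SA peak (row, col):", np.unravel_index(sa_final.argmax(), sa_final.shape))
```

In this toy setup the person looking elsewhere scores low against the initial heatmap and is dropped, so the final heatmap reflects only the people who share a target. That mirrors the consistency goal described for the refinement step.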




