SGAP-Gaze: Scene Grid Attention Based Point-of-Gaze Estimation Network for Driver Gaze
arXiv cs.CV / 4/23/2026
Key Points
- The article presents SGAP-Gaze, a driver point-of-gaze (PoG) estimation network that improves gaze prediction by explicitly incorporating traffic-scene context alongside facial cues.
- It introduces a benchmark dataset, Urban Driving-Face Scene Gaze (UD-FSG), which provides synchronized driver-face and traffic-scene images to support scene-aware gaze learning and evaluation.
- SGAP-Gaze fuses facial modalities (face, eye, iris) into a gaze-intent vector, then uses a Transformer-based attention mechanism over a spatial scene grid to produce the PoG.
- Experimental results show mean pixel error of 104.73 on UD-FSG and 63.48 on the LBW dataset, representing a 23.5% reduction versus state-of-the-art driver gaze estimation methods.
- Spatial distribution analysis indicates SGAP-Gaze maintains lower errors than existing approaches even in peripheral scene regions, where gaze samples are sparse in training data but important for assessing driver attention in real driving.
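The fusion-then-attention pipeline described above can be sketched in miniature: a single gaze-intent query vector attends over features from a spatial grid of scene patches, and the point of gaze is read out as the attention-weighted average of the grid cells' pixel centers. This is a hypothetical illustration, not the authors' implementation; the grid size, feature dimension, and readout scheme are illustrative assumptions.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D score vector."""
    e = np.exp(x - x.max())
    return e / e.sum()

def scene_grid_attention_pog(gaze_intent, scene_grid_feats, grid_size, image_wh):
    """Toy scene-grid attention readout (illustrative, not SGAP-Gaze itself).

    gaze_intent:      (d,)   fused face/eye/iris query vector
    scene_grid_feats: (G*G, d) one feature per scene grid cell
    grid_size:        G, number of cells per side
    image_wh:         (W, H) scene image size in pixels
    Returns a (2,) point-of-gaze estimate in pixel coordinates.
    """
    d = gaze_intent.shape[0]
    # Scaled dot-product attention scores of the query against each grid cell.
    scores = scene_grid_feats @ gaze_intent / np.sqrt(d)   # (G*G,)
    weights = softmax(scores)                              # sums to 1
    # Pixel-space center of each grid cell, row-major order.
    G, (W, H) = grid_size, image_wh
    ys, xs = np.meshgrid(np.arange(G), np.arange(G), indexing="ij")
    centers = np.stack([(xs.ravel() + 0.5) * W / G,
                        (ys.ravel() + 0.5) * H / G], axis=1)  # (G*G, 2)
    # PoG = attention-weighted expectation over cell centers.
    return weights @ centers

rng = np.random.default_rng(0)
pog = scene_grid_attention_pog(rng.normal(size=64),
                               rng.normal(size=(64, 64)),
                               grid_size=8, image_wh=(1280, 720))
```

Because the output is a convex combination of cell centers, the estimate always falls inside the scene image; the real network presumably learns both the scene features and the query projection end-to-end rather than using raw features as here.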