Real-time Appearance-based Gaze Estimation for Open Domains
arXiv cs.CV · March 31, 2026
Key Points
- The paper identifies a major generalization gap in appearance-based gaze estimation when applied to unconstrained real-world scenarios such as facial wearables and poor lighting.
- It attributes the gap to two main issues: insufficient image diversity during training and inconsistent label fidelity across datasets, especially along the pitch axis.
- The authors propose a robust framework that improves generalization without additional human annotation by using an augmented image-manifold ensemble (e.g., synthetic eyeglasses/masks and lighting variation) and multi-task learning.
- The multi-task formulation combines discretized gaze classification, multi-view supervised contrastive (SupCon) learning, and eye-region segmentation to reduce anisotropic inter-dataset label deviation.
- They introduce new benchmark datasets focused on robustness under challenging conditions, and report that a lightweight MobileNet-based model enables high-fidelity, real-time gaze tracking on mobile devices with fewer than 1% of the parameters of UniGaze-H.
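The summary does not spell out the multi-task formulation, but the discretized-classification component can be sketched as follows: continuous pitch/yaw angles are binned into class indices, and the per-task losses are combined as a weighted sum. The bin range, bin count, and loss weights below are illustrative assumptions, not values from the paper.

```python
def discretize_gaze(pitch_deg, yaw_deg, n_bins=36, lo=-90.0, hi=90.0):
    """Map continuous pitch/yaw angles (degrees) to class indices.

    Angles outside [lo, hi) are clamped to the nearest bin; the
    range and bin count are assumptions for illustration.
    Returns (pitch_bin, yaw_bin), each in [0, n_bins).
    """
    width = (hi - lo) / n_bins

    def to_bin(angle):
        idx = int((angle - lo) // width)
        return min(max(idx, 0), n_bins - 1)

    return to_bin(pitch_deg), to_bin(yaw_deg)


def multitask_loss(l_cls, l_supcon, l_seg, weights=(1.0, 0.5, 0.5)):
    """Weighted sum of the gaze-classification, SupCon, and
    eye-segmentation losses; the weights are hypothetical."""
    w_cls, w_con, w_seg = weights
    return w_cls * l_cls + w_con * l_supcon + w_seg * l_seg
```

With 36 bins over [-90, 90), each bin spans 5 degrees, so a frontal gaze (0, 0) lands in the middle bins (18, 18); out-of-range angles clamp to the edge bins rather than raising an error.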