Rapidly deploying on-device eye tracking by distilling visual foundation models
arXiv cs.CV / 4/6/2026
Key Points
- The paper proposes DistillGaze, a framework for rapidly deploying accurate on-device gaze estimation for AR/VR despite device-to-device variation in hardware and illumination.
- It addresses a key limitation of off-the-shelf visual foundation models on specialized near-eye infrared imagery by building a domain-specialized teacher, trained with self-supervised learning on both labeled synthetic and unlabeled real data.
- DistillGaze then trains a lightweight on-device student model via teacher guidance plus self-training, closing the synthetic-to-real domain gap.
- On a large crowd-sourced dataset with 2,000+ participants, DistillGaze cuts median gaze error by 58.62% versus synthetic-only baselines while keeping the model small (256K parameters) for real-time deployment.
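The teacher-guided training described above can be sketched as a combined objective: a supervised term on labeled synthetic frames plus a distillation term that pushes the student toward the teacher's predictions on unlabeled real frames. This is a minimal illustrative sketch, not the paper's actual loss: the function name, the L1 penalties, and the `alpha` weighting are all assumptions.

```python
import numpy as np

def distillation_loss(student_synth, labels_synth,
                      student_real, teacher_real,
                      alpha=0.5):
    """Hypothetical combined loss for teacher-guided student training.

    student_synth : student gaze predictions on labeled synthetic frames
    labels_synth  : ground-truth synthetic gaze vectors
    student_real  : student gaze predictions on unlabeled real frames
    teacher_real  : teacher pseudo-labels on the same real frames
    alpha         : assumed weighting between the two terms
    """
    # Supervised term: match synthetic ground truth (L1, an assumption).
    sup = np.abs(np.asarray(student_synth) - np.asarray(labels_synth)).mean()
    # Distillation / self-training term: match teacher pseudo-labels on real data.
    distill = np.abs(np.asarray(student_real) - np.asarray(teacher_real)).mean()
    return (1.0 - alpha) * sup + alpha * distill
```

In this sketch the teacher's outputs on real imagery serve as pseudo-labels, which is one common way to realize the "teacher guidance plus self-training" recipe the summary describes.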