ActiveGlasses: Learning Manipulation with Active Vision from Ego-centric Human Demonstration
arXiv cs.RO / 4/10/2026
Key Points
- The paper introduces ActiveGlasses, a robot learning system that captures manipulation and perception from ego-centric human demonstrations using active vision.
- It uses a stereo camera on smart glasses for data collection; at deployment, the same camera is mounted on a 6-DoF perception arm so the policy can run inference zero-shot.
- To support zero-shot transfer across platforms, the method extracts object trajectories from demonstrations and trains an object-centric point-cloud policy that jointly predicts manipulation actions and head movement.
- Experiments on multiple occlusion-heavy, precision-demanding interaction tasks show that ActiveGlasses achieves zero-shot transfer, outperforms strong baselines on the same hardware, and generalizes across two different robot platforms.
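The key points describe an object-centric point-cloud policy with two output heads: one for the manipulation action and one for head (camera) movement. A minimal sketch of that interface is below; it is not the paper's code, and all function names, dimensions, and the toy encoder are illustrative assumptions.

```python
# Hedged sketch of a joint manipulation + head-motion policy over an
# object-centric point cloud. A real system would use learned networks;
# here the encoder and both heads are toy functions.

def encode_point_cloud(points):
    """Toy permutation-invariant encoder: per-point features max-pooled,
    standing in for a learned point-cloud backbone."""
    # points: list of (x, y, z) tuples
    feats = [(x + y + z, x * y, y * z) for x, y, z in points]
    return tuple(max(f[i] for f in feats) for i in range(3))

def joint_policy(points):
    """Map an object-centric point cloud to (manipulation_action, head_motion).
    Output sizes are assumptions: a 7-dim arm command (pose + gripper) and a
    6-DoF command for the perception arm carrying the camera."""
    z = encode_point_cloud(points)
    manipulation_action = [0.1 * v for v in z] + [0.0, 0.0, 0.0, 1.0]  # 7-dim
    head_motion = [0.05 * v for v in z] + [0.0, 0.0, 0.0]              # 6-dim
    return manipulation_action, head_motion

cloud = [(0.1, 0.2, 0.3), (0.4, 0.1, 0.2)]
action, head = joint_policy(cloud)
```

Predicting both heads from one shared object-centric representation is what lets the same policy drive manipulation and active vision together, which is the coupling the paper's zero-shot transfer claim rests on.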