Show, Don't Tell: Detecting Novel Objects by Watching Human Videos
arXiv cs.CV / 3/16/2026
📰 NewsTools & Practical UsageModels & Research
Key Points
- The paper introduces 'Show, Don't Tell,' a self-supervised approach that trains bespoke object detectors directly from human demonstrations without relying on language descriptions.
- It automatically creates a training dataset from the demonstration and deploys an on-robot detector to recognize novel object instances seen during the task.
- The approach eliminates expensive language-based prompt engineering used by open-set detectors and outperforms state-of-the-art methods for detecting manipulated objects.
- The authors implement an integrated, real-world robotic system that deploys the paradigm to enable fast adaptation to unseen objects during demonstrations.




