Show, Don't Tell: Detecting Novel Objects by Watching Human Videos
arXiv cs.CV / 3/16/2026
Key Points
- The paper introduces 'Show, Don't Tell,' a self-supervised approach that trains bespoke object detectors directly from human demonstrations without relying on language descriptions.
- It automatically creates a training dataset from the demonstration and deploys an on-robot detector to recognize novel object instances seen during the task.
- The approach eliminates expensive language-based prompt engineering used by open-set detectors and outperforms state-of-the-art methods for detecting manipulated objects.
- The authors build an integrated, real-world robotic system around this paradigm, so the robot can adapt quickly to objects it first encounters in a demonstration.
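To make the "automatically creates a training dataset" point concrete, here is a minimal sketch of one way demonstration frames could be turned into detector labels. The summary does not say how the paper generates its labels, so everything here is an assumption for illustration: it supposes a per-frame binary mask of the manipulated object is already available, and the names `mask_to_box` and `build_dataset` are hypothetical, not taken from the paper.

```python
from typing import List, Optional, Tuple

# Assumption: each frame comes with a binary mask, 1 where the manipulated object is.
Mask = List[List[int]]
# (x_min, y_min, x_max, y_max), inclusive pixel indices.
Box = Tuple[int, int, int, int]


def mask_to_box(mask: Mask) -> Optional[Box]:
    """Return the tight bounding box around nonzero pixels, or None for an empty mask."""
    rows = [y for y, row in enumerate(mask) if any(row)]
    if not rows:
        return None
    cols = [x for row in mask for x, value in enumerate(row) if value]
    return (min(cols), rows[0], max(cols), rows[-1])


def build_dataset(masks: List[Mask]) -> List[dict]:
    """Turn per-frame masks into box labels, skipping frames where no object is visible."""
    labels = []
    for frame_idx, mask in enumerate(masks):
        box = mask_to_box(mask)
        if box is not None:
            labels.append({"frame": frame_idx, "box": box})
    return labels
```

Labels in this form need no text prompt or class name, which is the property the key points emphasize; how the object is actually localized in each frame is left open here.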