BridgeACT: Bridging Human Demonstrations to Robot Actions via Unified Tool-Target Affordances
arXiv cs.RO / 4/28/2026
Key Points
- BridgeACT is a new affordance-driven framework for learning robot manipulation directly from human videos, requiring no robot demonstration data.
- The approach uses embodiment-agnostic intermediate affordance representations to bridge human demonstrations and executable robot actions.
- It decomposes manipulation into two stages: grounding task-relevant affordance regions to identify where to grasp, then predicting task-conditioned 3D motion affordances to determine how to move (see the sketch after this list).
- BridgeACT maps learned affordances to real robot behaviors via a grasping module and a lightweight closed-loop motion controller, supporting direct real-robot deployment.
- Experiments on real-world tasks indicate improved performance over prior baselines and strong generalization to unseen objects, scenes, and viewpoints.
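To make the two-stage decomposition concrete, here is a minimal Python sketch of the pipeline as described in the key points: a grounding stage for where to grasp, a prediction stage for how to move, and a stubbed closed-loop controller that tracks the resulting trajectory. All class and function names are hypothetical; the paper does not publish this interface, and the stubs stand in for the actual learned models and robot API.

```python
# Hypothetical sketch of the affordance pipeline described above.
# All names are illustrative; BridgeACT's real interfaces will differ.
from dataclasses import dataclass
import numpy as np

@dataclass
class AffordanceRegion:
    """Task-relevant region to grasp: a 3D point plus an approach direction."""
    grasp_point: np.ndarray   # (3,) position in the robot base frame
    approach_dir: np.ndarray  # (3,) unit vector

def ground_affordance(rgb: np.ndarray, task: str) -> AffordanceRegion:
    """Stage 1 (hypothetical): locate WHERE to grasp for the given task.
    Stub returns a fixed region; a real model would ground it from the image."""
    return AffordanceRegion(grasp_point=np.array([0.4, 0.0, 0.1]),
                            approach_dir=np.array([0.0, 0.0, -1.0]))

def predict_motion_affordance(region: AffordanceRegion, task: str,
                              horizon: int = 20) -> np.ndarray:
    """Stage 2 (hypothetical): predict HOW to move as a task-conditioned
    3D waypoint trajectory from the grasp point. Stub: a straight lift."""
    lift = np.linspace(0.0, 0.15, horizon)[:, None] * np.array([0.0, 0.0, 1.0])
    return region.grasp_point + lift  # shape (horizon, 3)

def execute(region: AffordanceRegion, trajectory: np.ndarray) -> None:
    """Map affordances to robot behavior: grasp, then track waypoints with a
    simple proportional controller standing in for the closed-loop module."""
    ee_pos = region.grasp_point.copy()      # pretend the grasp succeeded
    for target in trajectory:
        for _ in range(50):                 # inner control loop per waypoint
            error = target - ee_pos
            if np.linalg.norm(error) < 1e-3:
                break
            ee_pos += 0.2 * error           # P-control step; robot API omitted

if __name__ == "__main__":
    image = np.zeros((480, 640, 3), dtype=np.uint8)  # placeholder observation
    region = ground_affordance(image, task="pick up the mug")
    waypoints = predict_motion_affordance(region, task="pick up the mug")
    execute(region, waypoints)
    print(f"Tracked {len(waypoints)} waypoints from {region.grasp_point}")
```

The point of the structure is the clean separation the paper claims: the two affordance stages are embodiment-agnostic, and only the final `execute` step touches robot-specific control.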