AffordSim: A Scalable Data Generator and Benchmark for Affordance-Aware Robotic Manipulation

arXiv cs.RO / 4/14/2026


Key Points

  • AffordSim is a new simulation framework that generates manipulation trajectories using object affordance information, enabling semantically correct interactions like handle grasping, precise pouring, and mug hanging.
  • It integrates open-vocabulary 3D affordance prediction via the authors’ VoxAfford model to produce affordance maps on object point clouds and uses these maps to guide grasp pose estimation toward task-relevant functional regions.
  • AffordSim is implemented on NVIDIA Isaac Sim with cross-embodiment support across robots (e.g., Franka FR3, Panda, UR5e, Kinova), VLM-powered task generation, and domain randomization driven by DA3-style 3D Gaussian reconstruction from real photos.
  • The paper introduces a benchmark of 50 tasks across 7 categories and evaluates imitation learning baselines (BC, Diffusion Policy, ACT, Pi 0.5), finding that affordance-heavy tasks (pouring, mug hanging) remain much less successful than grasping.
  • Zero-shot sim-to-real experiments on a real Franka FR3 suggest the affordance-aware generated data transfers effectively beyond simulation.
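The second key point — using predicted affordance maps to steer grasp pose estimation toward functional regions — can be illustrated with a minimal sketch. Everything below is a hypothetical stand-in (the function name `score_grasps`, the contact-point representation, and the radius parameter are assumptions, not the paper's actual interface): each grasp candidate is scored by the mean affordance of the object points near its contact point, so candidates on high-affordance regions (e.g., a mug handle) rank first.

```python
import numpy as np

def score_grasps(points, affordance, contacts, radius=0.02):
    """Score grasp candidates by mean affordance near each contact point.

    points     : (N, 3) object point cloud
    affordance : (N,) per-point affordance scores in [0, 1]
    contacts   : (M, 3) one contact point per grasp candidate
    radius     : neighborhood radius (meters) around each contact
    """
    scores = []
    for c in contacts:
        dists = np.linalg.norm(points - c, axis=1)
        mask = dists < radius
        # Candidates with no nearby points get a score of zero.
        scores.append(affordance[mask].mean() if mask.any() else 0.0)
    return np.array(scores)
```

A grasp planner could then keep only the top-scoring candidates, biasing data generation toward semantically correct contacts such as handle grasps.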

Abstract

Simulation-based data generation has become a dominant paradigm for training robotic manipulation policies, yet existing platforms do not incorporate object affordance information into trajectory generation. As a result, tasks requiring precise interaction with specific functional regions (grasping a mug by its handle, pouring from a cup's rim, or hanging a mug on a hook) cannot be automatically generated with semantically correct trajectories. We introduce AffordSim, the first simulation framework that integrates open-vocabulary 3D affordance prediction into the manipulation data generation pipeline. AffordSim uses our VoxAfford model, an open-vocabulary 3D affordance detector that enhances MLLM output tokens with multi-scale geometric features, to predict affordance maps on object point clouds, guiding grasp pose estimation toward task-relevant functional regions. Built on NVIDIA Isaac Sim with cross-embodiment support (Franka FR3, Panda, UR5e, Kinova), VLM-powered task generation, and novel domain randomization using DA3-based 3D Gaussian reconstruction from real photographs, AffordSim enables automated, scalable generation of affordance-aware manipulation data. We establish a benchmark of 50 tasks across 7 categories (grasping, placing, stacking, pushing/pulling, pouring, mug hanging, long-horizon composite) and evaluate 4 imitation learning baselines (BC, Diffusion Policy, ACT, Pi 0.5). Our results reveal that while grasping is largely solved (53-93% success), affordance-demanding tasks such as pouring into narrow containers (1-43%) and mug hanging (0-47%) remain significantly more challenging for current imitation learning methods, highlighting the need for affordance-aware data generation. Zero-shot sim-to-real experiments on a real Franka FR3 validate the transferability of the generated data.
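The end-to-end pipeline described above (VLM task proposal → affordance prediction → grasp selection → trajectory generation) can be sketched as a skeleton. This is purely illustrative: the function names, data shapes, and dummy affordance map are assumptions, and the real system runs VoxAfford and a trajectory planner inside Isaac Sim rather than these stubs.

```python
import numpy as np

def propose_task(scene_objects):
    # Stand-in for VLM-powered task generation: pair a verb with an object.
    return {"verb": "grasp", "object": scene_objects[0]}

def predict_affordance(point_cloud, verb):
    # Stand-in for open-vocabulary 3D affordance prediction (VoxAfford-style):
    # a real model returns a per-point map conditioned on the verb; here, a
    # dummy uniform map of the right shape.
    return np.full(len(point_cloud), 0.5)

def select_grasp(point_cloud, affordance):
    # Target the highest-affordance point on the object.
    return point_cloud[int(np.argmax(affordance))]

def generate_episode(scene_objects, point_cloud):
    task = propose_task(scene_objects)
    aff = predict_affordance(point_cloud, task["verb"])
    grasp = select_grasp(point_cloud, aff)
    # A real system would plan and roll out a trajectory in simulation here,
    # with domain randomization applied to the rendered scene.
    return {"task": task, "grasp_point": grasp, "affordance": aff}
```

The skeleton makes the division of labor explicit: the VLM decides *what* to do, the affordance model decides *where* on the object to do it, and the simulator produces the resulting trajectory data.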