cuRoboV2: Dynamics-Aware Motion Generation with Depth-Fused Distance Fields for High-DoF Robots

arXiv cs.RO · April 17, 2026


Key Points

  • cuRoboV2 is a unified motion-generation framework aimed at producing safe, physically feasible, and reactive robot trajectories, addressing gaps in existing planning, control, and solver methods for high-DoF robots.
  • The system introduces B-spline trajectory optimization that enforces smoothness and torque limits, improving feasibility for dynamics-aware execution.
  • It also proposes a GPU-native depth-fused TSDF/ESDF perception pipeline that generates dense signed distance fields across the full workspace (reportedly up to 10× faster and using 8× less memory), achieving up to 99% collision recall.
  • For high-DoF whole-body motion, cuRoboV2 adds scalable GPU-native computation including topology-aware kinematics, differentiable inverse dynamics, and a map-reduce self-collision approach, with up to 61× speedup and support for high-DoF humanoids.
  • Benchmarks show major gains over baselines, including 99.7% success with a 3 kg payload, 99.6% collision-free IK on a 48-DoF humanoid, and better retargeting constraint satisfaction, alongside a redesigned codebase that enabled LLM coding assistants to generate up to 73% of new modules.
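To make the B-spline idea concrete: because the derivatives of a B-spline are themselves B-splines, velocity and acceleration (and, via inverse dynamics, torque) limits can be checked densely along the whole trajectory. The sketch below is illustrative only — `make_trajectory` and `within_limits` are hypothetical names, not cuRoboV2's API — and checks kinematic bounds on a clamped cubic spline with SciPy:

```python
import numpy as np
from scipy.interpolate import BSpline

def make_trajectory(control_points, t_end=1.0, degree=3):
    """Build a clamped cubic B-spline through joint-space control points."""
    n = len(control_points)
    # Clamped knot vector: end knots repeated (degree + 1) times in total.
    inner = np.linspace(0.0, t_end, n - degree + 1)
    knots = np.concatenate([[0.0] * degree, inner, [t_end] * degree])
    return BSpline(knots, np.asarray(control_points, float), degree)

def within_limits(spline, vel_limit, acc_limit, samples=200):
    """Sample derivative splines and check per-joint velocity/acceleration bounds."""
    t = np.linspace(spline.t[spline.k], spline.t[-spline.k - 1], samples)
    vel = spline.derivative(1)(t)
    acc = spline.derivative(2)(t)
    return bool(np.all(np.abs(vel) <= vel_limit) and np.all(np.abs(acc) <= acc_limit))

# 2-DoF example: a small, slow motion satisfies generous limits.
ctrl = np.array([[0.0, 0.0], [0.1, 0.05], [0.2, 0.1], [0.3, 0.1], [0.3, 0.1]])
traj = make_trajectory(ctrl)
print(within_limits(traj, vel_limit=5.0, acc_limit=50.0))
```

In an optimizer, the same derivative evaluations would enter the cost as smoothness terms and as penalty constraints on the limits, rather than a boolean post-hoc check.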

Abstract

Effective robot autonomy requires motion generation that is safe, feasible, and reactive. Current methods are fragmented: fast planners output physically unexecutable trajectories, reactive controllers struggle with high-fidelity perception, and existing solvers fail on high-DoF systems. We present cuRoboV2, a unified framework with three key innovations: (1) B-spline trajectory optimization that enforces smoothness and torque limits; (2) a GPU-native TSDF/ESDF perception pipeline that generates dense signed distance fields covering the full workspace, unlike existing methods that only provide distances within sparsely allocated blocks, up to 10x faster and in 8x less memory than the state-of-the-art at manipulation scale, with up to 99% collision recall; and (3) scalable GPU-native whole-body computation, namely topology-aware kinematics, differentiable inverse dynamics, and map-reduce self-collision, that achieves up to 61x speedup while also extending to high-DoF humanoids (where previous GPU implementations fail). On benchmarks, cuRoboV2 achieves 99.7% success under 3kg payload (where baselines achieve only 72--77%), 99.6% collision-free IK on a 48-DoF humanoid (where prior methods fail entirely), and 89.5% retargeting constraint satisfaction (vs. 61% for PyRoki); these collision-free motions yield locomotion policies with 21% lower tracking error than PyRoki and 12x lower cross-seed variance than GMR. A ground-up codebase redesign for discoverability enabled LLM coding assistants to author up to 73% of new modules, including hand-optimized CUDA kernels, demonstrating that well-structured robotics code can unlock productive human-LLM collaboration. Together, these advances provide a unified, dynamics-aware motion generation stack that scales from single-arm manipulators to full humanoids. Code is available at https://github.com/NVlabs/curobo.
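The key property of the perception pipeline above is that the ESDF is dense over the full workspace, so every voxel carries a signed distance usable by the collision cost. A minimal CPU sketch of that final step, assuming an occupancy grid has already been fused from depth (the function names and voxel layout here are illustrative, not cuRoboV2's GPU implementation):

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

def esdf_from_occupancy(occupied, voxel_size):
    """Dense signed distance per voxel: positive in free space, negative inside obstacles."""
    free = ~occupied
    # Distance from each free voxel to the nearest occupied voxel...
    dist_out = distance_transform_edt(free) * voxel_size
    # ...and from each occupied voxel to the nearest free voxel.
    dist_in = distance_transform_edt(occupied) * voxel_size
    return dist_out - dist_in

# 16^3 grid (5 cm voxels) with one occupied block in the center.
grid = np.zeros((16, 16, 16), dtype=bool)
grid[6:10, 6:10, 6:10] = True
sdf = esdf_from_occupancy(grid, voxel_size=0.05)
print(sdf[0, 0, 0] > 0, sdf[8, 8, 8] < 0)
```

Querying this field at a robot's collision-sphere centers and subtracting the sphere radii gives the signed clearance used in collision costs; the paper's contribution is doing the fusion and distance propagation GPU-natively over the whole workspace rather than only inside sparsely allocated blocks.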
