FLASH: Fast Learning via GPU-Accelerated Simulation for High-Fidelity Deformable Manipulation in Minutes

arXiv cs.RO / 4/21/2026


Key Points

  • The paper introduces FLASH, a GPU-native simulation framework aimed at accelerating contact-rich deformable object manipulation, a major bottleneck for current soft-material robotics learning.
  • FLASH is built around an accurate NCP-based solver that enforces strict contact and deformation constraints and is designed specifically for fine-grained GPU parallelism, rather than being a port of a conventional single-instruction-multiple-data (SIMD) solver to the GPU.
  • FLASH reportedly scales to more than 3 million degrees of freedom at 30 FPS on a single RTX 5090 while still simulating physical interactions accurately.
  • Policies trained in minutes, solely on FLASH-generated synthetic data, reportedly achieve robust zero-shot sim-to-real transfer on physical robots for tasks such as towel folding and garment folding, without any real-world demonstrations.
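
To put the reported scale in perspective, the short sketch below converts the headline figures (3 million degrees of freedom, 30 FPS) into per-frame time budgets. The function name `frame_budget` and its return keys are hypothetical and chosen only for illustration. The result is plain arithmetic on the quoted numbers, not a measurement of FLASH itself.

```python
def frame_budget(dofs: int, fps: float) -> dict:
    """Convert a (DoF count, frame rate) claim into simple time budgets.

    Hypothetical helper: this is arithmetic on reported figures and says
    nothing about how FLASH actually spends its frame time.
    """
    frame_ms = 1000.0 / fps             # wall-clock time available per frame
    dof_updates_per_s = dofs * fps      # DoF updates that must be sustained
    ns_per_dof = frame_ms * 1e6 / dofs  # average time per DoF within a frame
    return {
        "frame_ms": frame_ms,
        "dof_updates_per_s": dof_updates_per_s,
        "ns_per_dof": ns_per_dof,
    }


# Figures quoted in the summary: more than 3 million DoF at 30 FPS.
budget = frame_budget(dofs=3_000_000, fps=30.0)
print(budget)
```

At these figures a frame lasts roughly 33 ms and the average budget is about 11 ns per degree of freedom, which covers collision handling and constraint solving combined. A budget that small is only reachable if many degrees of freedom are processed concurrently.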

Abstract

Simulation frameworks such as Isaac Sim have enabled scalable robot learning for locomotion and rigid-body manipulation; however, contact-rich simulation remains a major bottleneck for deformable object manipulation. The continuously changing geometry of soft materials, together with large numbers of vertices and contact constraints, makes it difficult to achieve the high accuracy, speed, and stability required for large-scale interactive learning. We present FLASH, a GPU-native simulation framework for contact-rich deformable manipulation, built on an accurate NCP-based solver that enforces strict contact and deformation constraints while being explicitly designed for fine-grained GPU parallelism. Rather than porting conventional single-instruction-multiple-data (SIMD) solvers to GPUs, FLASH redesigns the physics engine from the ground up to leverage modern GPU architectures, including optimized collision handling and memory layouts. As a result, FLASH scales to over 3 million degrees of freedom at 30 FPS on a single RTX 5090, while accurately simulating physical interactions. Policies trained solely on FLASH-generated synthetic data in minutes achieve robust zero-shot sim-to-real transfer. We validate this on physical robots performing challenging deformable manipulation tasks such as towel folding and garment folding, without any real-world demonstrations, providing a practical alternative to labor-intensive real-world data collection.
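
The abstract does not describe the solver's internals. As a rough illustration of what a complementarity-based contact formulation involves, the sketch below solves a tiny linear complementarity problem, a linear special case of the NCP setting (NCP usually stands for nonlinear complementarity problem in contact simulation): find contact impulses λ ≥ 0 such that the resulting gaps w = Aλ + b are also ≥ 0 and λᵢ·wᵢ = 0 for every contact. It uses projected Gauss-Seidel, a textbook method swapped in here for clarity. It is not FLASH's algorithm, and the function names, the matrix `A`, and the vector `b` are all made-up illustrative values.

```python
def projected_gauss_seidel(A, b, iterations=100):
    """Solve 0 <= lam, 0 <= A @ lam + b, lam_i * (A @ lam + b)_i = 0.

    Illustrative only: a sequential textbook solver for a small dense
    problem, not the GPU-parallel solver described in the paper.
    """
    n = len(b)
    lam = [0.0] * n
    for _ in range(iterations):
        for i in range(n):
            # Gap at contact i caused by every other contact's impulse.
            r = b[i] + sum(A[i][j] * lam[j] for j in range(n) if j != i)
            # Impulse that closes the gap, projected onto lam_i >= 0
            # (contacts can push but never pull).
            lam[i] = max(0.0, -r / A[i][i])
    return lam


def gaps(A, b, lam):
    """Return w = A @ lam + b, the post-solve gap at each contact."""
    n = len(b)
    return [b[i] + sum(A[i][j] * lam[j] for j in range(n)) for i in range(n)]


# Hypothetical two-contact problem: contact 0 starts penetrating (b < 0),
# contact 1 starts separated (b > 0).
A = [[2.0, 1.0], [1.0, 2.0]]
b = [-1.0, 1.0]
lam = projected_gauss_seidel(A, b)
w = gaps(A, b, lam)
print(lam, w)
```

In this toy problem only the penetrating contact ends up with a nonzero impulse, while the separated contact keeps a positive gap and zero impulse, which is exactly the complementarity condition. Gauss-Seidel sweeps update contacts one after another, so a solver aimed at fine-grained GPU parallelism would need a different update scheme; the abstract does not say which one FLASH uses.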