From Seeing to Simulating: Generative High-Fidelity Simulation with Digital Cousins for Generalizable Robot Learning and Evaluation

arXiv cs.RO / 4/20/2026

📰 NewsDeveloper Stack & InfrastructureSignals & Early TrendsModels & Research

共有:

Key Points

The paper proposes a generative framework that maps real-world panoramic scenes into high-fidelity simulation environments to reduce the need for expensive real-data collection.
It generates diverse “digital cousin” scenes through semantic and geometric editing, leveraging realistic assets and high-quality physics engines to support interactive robot manipulation tasks.
The approach also uses multi-room stitching to build consistent large-scale environments for long-horizon navigation in complex layouts.
Experiments report strong sim-to-real correlation, and show that scaling up synthetic scene generation improves generalization to unseen variations in scenes and objects for robot learning and evaluation.

Abstract

Learning robust robot policies in real-world environments requires diverse data augmentation, yet scaling real-world data collection is costly due to the need for acquiring physical assets and reconfiguring environments. Therefore, augmenting real-world scenes into simulation has become a practical augmentation for efficient learning and evaluation. We present a generative framework that establishes a generative real-to-sim mapping from real-world panoramas to high-fidelity simulation scenes, and further synthesize diverse cousin scenes via semantic and geometric editing. Combined with high-quality physics engines and realistic assets, the generated scenes support interactive manipulation tasks. Additionally, we incorporate multi-room stitching to construct consistent large-scale environments for long-horizon navigation across complex layouts. Experiments demonstrate a strong sim-to-real correlation validating our platform's fidelity, and show that extensively scaling up data generation leads to significantly better generalization to unseen scene and object variations, demonstrating the effectiveness of Digital Cousins for generalizable robot learning and evaluation.

Which Version of Qwen 3.6 for M5 Pro 24g

Reddit r/LocalLLaMA

From Theory to Reality: Why Most AI Agent Projects Fail (And How Mine Did Too)

Dev.to

GPT-5.4-Cyber: OpenAI's Game-Changer for AI Security and Defensive AI

Dev.to

Building Digital Souls: The Brutal Reality of Creating AI That Understands You Like Nobody Else

Dev.to

Local LLM Beginner’s Guide (Mac - Apple Silicon)

Reddit r/artificial

From Seeing to Simulating: Generative High-Fidelity Simulation with Digital Cousins for Generalizable Robot Learning and Evaluation

Key Points

Abstract

Related Articles

Which Version of Qwen 3.6 for M5 Pro 24g

From Theory to Reality: Why Most AI Agent Projects Fail (And How Mine Did Too)

GPT-5.4-Cyber: OpenAI's Game-Changer for AI Security and Defensive AI

Building Digital Souls: The Brutal Reality of Creating AI That Understands You Like Nobody Else

Local LLM Beginner’s Guide (Mac - Apple Silicon)

関連おすすめサービス

Notta搭載AI議事録イヤホン ZENCHORD1

AI搭載ボイスレコーダー Plaud

画像高画質化AIツール Aiarty Image Enhancer