Ψ-Map: Panoptic Surface Integrated Mapping Enables Real2Sim Transfer

arXiv cs.RO / 4/14/2026


Key Points

  • The paper introduces Ψ-Map, a framework for open-vocabulary panoptic reconstruction aimed at enabling more reliable real-to-sim transfer for robotics perception and simulation.
  • It uses LiDAR-driven plane-constrained multimodal Gaussian Mixture Models and 2D Gaussian surfels to improve surface alignment and provide continuous geometric supervision in large-scale environments.
  • To reduce error accumulation from multi-stage pipelines, it proposes a query-guided end-to-end architecture that lifts 2D mask features into 3D via local cross-attention within the view frustum for globally consistent panoptic understanding.
  • It improves real-time performance by optimizing rendering with Precise Tile Intersection and a Top-K hard selection strategy for semantic computation.
  • Experiments report better geometric and panoptic reconstruction quality while sustaining inference above 40 FPS, meeting the real-time needs of robotic control loops.
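The Top-K hard selection idea can be illustrated with a minimal sketch: instead of alpha-blending high-dimensional semantic features over every Gaussian covering a pixel, keep only the K Gaussians with the largest blending weights and renormalize. This is a hedged reconstruction of the general strategy, not the paper's implementation; the function name and shapes are assumptions for illustration.

```python
import numpy as np

def topk_hard_select(weights, features, k=3):
    """Blend semantic features over only the top-K Gaussians per pixel.

    weights:  (P, N) per-pixel alpha-blending weights over N Gaussians
    features: (N, D) high-dimensional semantic feature per Gaussian
    Returns:  (P, D) blended semantics using K terms instead of N.
    """
    # Indices of the K largest weights per pixel (unordered within the K).
    idx = np.argpartition(weights, -k, axis=1)[:, -k:]          # (P, k)
    w = np.take_along_axis(weights, idx, axis=1)                # (P, k)
    # Renormalize so the truncated weights still sum to 1 per pixel.
    w = w / np.clip(w.sum(axis=1, keepdims=True), 1e-8, None)
    # Gather the K feature vectors per pixel and blend them.
    return np.einsum('pk,pkd->pd', w, features[idx])            # (P, D)
```

The payoff is that the expensive D-dimensional blend scales with K rather than with the number of overlapping Gaussians, which is what makes high-dimensional semantic rendering tractable at frame rate.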

Abstract

Open-vocabulary panoptic reconstruction is essential for advanced robotics perception and simulation. However, existing methods based on 3D Gaussian Splatting (3DGS) often struggle to simultaneously achieve geometric accuracy, coherent panoptic understanding, and real-time inference frequency in large-scale scenes. In this paper, we propose a comprehensive framework that integrates geometric reinforcement, end-to-end panoptic learning, and efficient rendering. First, to ensure physical realism in large-scale environments, we leverage LiDAR data to construct plane-constrained multimodal Gaussian Mixture Models (GMMs) and employ 2D Gaussian surfels as the map representation, enabling high-precision surface alignment and continuous geometric supervision. Building upon this, to overcome the error accumulation and cumbersome cross-frame association inherent in traditional multi-stage panoptic segmentation pipelines, we design a query-guided end-to-end learning architecture. By utilizing a local cross-attention mechanism within the view frustum, the system lifts 2D mask features directly into 3D space, achieving globally consistent panoptic understanding. Finally, addressing the computational bottlenecks caused by high-dimensional semantic features, we introduce Precise Tile Intersection and a Top-K Hard Selection strategy to optimize the rendering pipeline. Experimental results demonstrate that our system achieves superior geometric and panoptic reconstruction quality in large-scale scenes while maintaining an inference rate exceeding 40 FPS, meeting the real-time requirements of robotic control loops.
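The query-guided lifting step described above can be sketched as a local cross-attention: each 3D Gaussian acts as a query that attends only to the 2D mask features visible within its view frustum, pulling per-frame mask embeddings into a globally shared 3D representation. The following is a minimal, assumed-shape sketch of that mechanism, not the paper's architecture; it presumes every Gaussian sees at least one mask in the frame.

```python
import numpy as np

def lift_masks_to_3d(gauss_feat, mask_feat, in_frustum):
    """Local cross-attention from 3D Gaussian queries to 2D mask features.

    gauss_feat: (G, D) query embeddings attached to 3D Gaussians
    mask_feat:  (M, D) 2D mask embeddings from the current frame
    in_frustum: (G, M) boolean, True where mask m is visible to Gaussian g
                (assumes each row has at least one True entry)
    Returns:    (G, D) lifted 3D features.
    """
    D = gauss_feat.shape[1]
    # Scaled dot-product attention scores.
    logits = gauss_feat @ mask_feat.T / np.sqrt(D)    # (G, M)
    # Restrict attention to masks inside the local view frustum.
    logits = np.where(in_frustum, logits, -np.inf)
    # Numerically stable softmax over the visible masks.
    logits -= logits.max(axis=1, keepdims=True)
    attn = np.exp(logits)
    attn /= attn.sum(axis=1, keepdims=True)
    # Weighted sum of 2D mask features gives the lifted 3D feature.
    return attn @ mask_feat                           # (G, D)
```

Because the attention is computed per frame but accumulated on persistent 3D Gaussians, the cross-frame association that multi-stage pipelines handle heuristically falls out of the shared 3D queries, which is the stated motivation for the end-to-end design.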