OnlinePG: Online Open-Vocabulary Panoptic Mapping with 3D Gaussian Splatting

arXiv cs.CV / 3/20/2026

📰 NewsIdeas & Deep AnalysisModels & Research

共有:

Key Points

OnlinePG introduces an online open-vocabulary panoptic mapping system that combines geometric reconstruction and open-vocabulary perception using 3D Gaussian Splatting.
The method uses a sliding window local-to-global paradigm to maintain real-time performance and builds a local 3D segment clustering graph that fuses inconsistent segments into complete instances.
Global map updates are performed by representing local 3D Gaussian maps in explicit spatial grids and merging them with a robust bidirectional bipartite 3D Gaussian instance matching.
The approach leverages Vision-Language Model features within 3D spatial grids to enable open-vocabulary scene understanding.
Extensive experiments on standard datasets show state-of-the-art or competitive performance among online methods while preserving real-time efficiency.

Abstract

Open-vocabulary scene understanding with online panoptic mapping is essential for embodied applications to perceive and interact with environments. However, existing methods are predominantly offline or lack instance-level understanding, limiting their applicability to real-world robotic tasks. In this paper, we propose OnlinePG, a novel and effective system that integrates geometric reconstruction and open-vocabulary perception using 3D Gaussian Splatting in an online setting. Technically, to achieve online panoptic mapping, we employ an efficient local-to-global paradigm with a sliding window. To build local consistency map, we construct a 3D segment clustering graph that jointly leverages geometric and semantic cues, fusing inconsistent segments within sliding window into complete instances. Subsequently, to update the global map, we construct explicit grids with spatial attributes for the local 3D Gaussian map and fuse them into the global map via robust bidirectional bipartite 3D Gaussian instance matching. Finally, we utilize the fused VLM features inside the 3D spatial attribute grids to achieve open-vocabulary scene understanding. Extensive experiments on widely used datasets demonstrate that our method achieves better performance among online approaches, while maintaining real-time efficiency.

Attacks On Data Centers, Qwen3.5 In All Sizes, DeepSeek’s Huawei Play, Apple’s Multimodal Tokenizer

The Batch

Your AI generated code is "almost right", and that is actually WORSE than it being "wrong".

Dev.to

Lessons from Academic Plagiarism Tools for SaaS Product Development

Dev.to

Core Allocation Optimization for Energy‑Efficient Multi‑Core Scheduling in ARINC650 Systems

Dev.to

KI in der amtlichen Recherche beim DPMA: Was Patentanwälte bei Neuanmeldungen jetzt beachten sollten (Stand: März 2026)

Dev.to

OnlinePG: Online Open-Vocabulary Panoptic Mapping with 3D Gaussian Splatting

Key Points

Abstract

Related Articles

Attacks On Data Centers, Qwen3.5 In All Sizes, DeepSeek’s Huawei Play, Apple’s Multimodal Tokenizer

Your AI generated code is "almost right", and that is actually WORSE than it being "wrong".

Lessons from Academic Plagiarism Tools for SaaS Product Development

Core Allocation Optimization for Energy‑Efficient Multi‑Core Scheduling in ARINC650 Systems

KI in der amtlichen Recherche beim DPMA: Was Patentanwälte bei Neuanmeldungen jetzt beachten sollten (Stand: März 2026)

関連おすすめサービス

Notta搭載AI議事録イヤホン ZENCHORD1

AI搭載ボイスレコーダー Plaud

画像高画質化AIツール Aiarty Image Enhancer

Key Points

Abstract

Related Articles

Attacks On Data Centers, Qwen3.5 In All Sizes, DeepSeek’s Huawei Play, Apple’s Multimodal Tokenizer

Your AI generated code is "almost right", and that is actually WORSE than it being "wrong".

Lessons from Academic Plagiarism Tools for SaaS Product Development

**Core Allocation Optimization for Energy‑Efficient Multi‑Core Scheduling in ARINC650 Systems**

KI in der amtlichen Recherche beim DPMA: Was Patentanwälte bei Neuanmeldungen jetzt beachten sollten (Stand: März 2026)

関連おすすめサービス

Notta搭載AI議事録イヤホン ZENCHORD1

AI搭載ボイスレコーダー Plaud

画像高画質化AIツール Aiarty Image Enhancer

Core Allocation Optimization for Energy‑Efficient Multi‑Core Scheduling in ARINC650 Systems