Gau-Occ: Geometry-Completed Gaussians for Multi-Modal 3D Occupancy Prediction

arXiv cs.CV / 3/25/2026

Key Points

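Gau-Occ is a multi-modal framework for 3D semantic occupancy prediction in autonomous driving. It represents the scene as a compact collection of semantic 3D Gaussians, which lets it improve accuracy while avoiding dense voxel/BEV processing.
To fill in structures that are missing when LiDAR is sparse, the authors propose a LiDAR Completion Diffuser (LCD), which enables robust initialization of Gaussian anchors.
Semantic information from multi-view images is integrated efficiently through geometry-aligned 2D sampling and cross-modal alignment (Gaussian Anchor Fusion: GAF), with the aim of achieving both spatial consistency and semantic discriminability.
In experiments, the method is reported to reach state-of-the-art performance on several challenging benchmarks while also offering a substantial advantage in computational efficiency.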
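
As a rough illustration of what geometry-aligned 2D sampling can look like in general, the sketch below projects a 3D anchor position into a camera image with a pinhole model and reads an image feature at that pixel by bilinear interpolation. This is not the paper's GAF module: the function names, the single-channel feature map, and the intrinsics (fx, fy, cx, cy) are hypothetical and chosen only for illustration.

```python
def project_point(p_cam, fx, fy, cx, cy):
    """Pinhole projection of a point given in camera coordinates.

    Returns (u, v) in pixels, or None if the point is behind the camera.
    All parameter names are assumptions made for this sketch.
    """
    x, y, z = p_cam
    if z <= 0:
        return None
    return (fx * x / z + cx, fy * y / z + cy)


def bilinear_sample(feat, u, v):
    """Bilinearly interpolate a 2D feature map at a sub-pixel location.

    feat is a list of rows (H x W) of floats; u indexes columns, v indexes rows.
    Returns None when (u, v) falls outside the map.
    """
    h, w = len(feat), len(feat[0])
    if not (0 <= u <= w - 1 and 0 <= v <= h - 1):
        return None
    u0, v0 = int(u), int(v)
    u1, v1 = min(u0 + 1, w - 1), min(v0 + 1, h - 1)
    du, dv = u - u0, v - v0
    top = feat[v0][u0] * (1 - du) + feat[v0][u1] * du
    bottom = feat[v1][u0] * (1 - du) + feat[v1][u1] * du
    return top * (1 - dv) + bottom * dv


def sample_anchor_feature(anchor_cam, feat, fx, fy, cx, cy):
    """Project one anchor into the image and sample the feature there."""
    uv = project_point(anchor_cam, fx, fy, cx, cy)
    if uv is None:
        return None
    return bilinear_sample(feat, uv[0], uv[1])


# Toy 2 x 2 feature map and an anchor that projects to pixel (0.5, 0.5).
toy_feat = [[0.0, 1.0], [2.0, 3.0]]
print(sample_anchor_feature((0.5, 0.5, 1.0), toy_feat, 1.0, 1.0, 0.0, 0.0))  # 1.5
```

In a multi-view setting, the same anchor would be projected into every camera that sees it, and the sampled features would then be combined; how Gau-Occ does that combination is not specified in this summary.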

Abstract

3D semantic occupancy prediction is crucial for autonomous driving. While multi-modal fusion improves accuracy over vision-only methods, it typically relies on computationally expensive dense voxel or BEV tensors. We present Gau-Occ, a multi-modal framework that bypasses dense volumetric processing by modeling the scene as a compact collection of semantic 3D Gaussians. To ensure geometric completeness, we propose a LiDAR Completion Diffuser (LCD) that recovers missing structures from sparse LiDAR to initialize robust Gaussian anchors. Furthermore, we introduce Gaussian Anchor Fusion (GAF), which efficiently integrates multi-view image semantics via geometry-aligned 2D sampling and cross-modal alignment. By refining these compact Gaussian descriptors, Gau-Occ captures both spatial consistency and semantic discriminability. Extensive experiments across challenging benchmarks demonstrate that Gau-Occ achieves state-of-the-art performance with significant computational efficiency.
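
To make the idea of a "compact collection of semantic 3D Gaussians" more concrete, here is a minimal sketch of how such a set could be queried for occupancy at an arbitrary 3D point without ever building a dense voxel grid. It is an assumption-laden toy, not Gau-Occ's implementation: the `SemanticGaussian` fields, the axis-aligned covariance, the opacity-weighted voting, and the `empty_threshold` value are all hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class SemanticGaussian:
    mean: tuple      # (x, y, z) center
    scale: tuple     # per-axis standard deviation (axis-aligned for simplicity)
    opacity: float   # contribution weight in [0, 1]
    logits: list     # one semantic score per class


def density(g, p):
    """Unnormalized Gaussian falloff of g evaluated at point p."""
    d2 = sum(((pi - mi) / si) ** 2 for pi, mi, si in zip(p, g.mean, g.scale))
    return g.opacity * math.exp(-0.5 * d2)


def query_occupancy(gaussians, p, num_classes, empty_threshold=0.1):
    """Return the predicted class index at p, or None if p is considered empty.

    Each Gaussian votes for its classes, weighted by its density at p.
    The threshold is an arbitrary value picked for this sketch.
    """
    scores = [0.0] * num_classes
    total = 0.0
    for g in gaussians:
        w = density(g, p)
        total += w
        for c in range(num_classes):
            scores[c] += w * g.logits[c]
    if total < empty_threshold:
        return None
    return max(range(num_classes), key=lambda c: scores[c])


# Toy scene with two classes: 0 = "car", 1 = "road" (labels are illustrative).
scene = [
    SemanticGaussian((0.0, 0.0, 0.0), (1.0, 1.0, 1.0), 0.9, [1.0, 0.0]),
    SemanticGaussian((5.0, 0.0, 0.0), (1.0, 1.0, 1.0), 0.9, [0.0, 1.0]),
]
print(query_occupancy(scene, (0.2, 0.0, 0.0), 2))     # 0
print(query_occupancy(scene, (5.0, 0.1, 0.0), 2))     # 1
print(query_occupancy(scene, (20.0, 20.0, 20.0), 2))  # None
```

The point of the sketch is the storage trade-off: the scene is described by a handful of parameters per Gaussian, and occupancy is evaluated only where it is queried, rather than being stored for every cell of a dense volume.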
