An Instance-Centric Panoptic Occupancy Prediction Benchmark for Autonomous Driving

arXiv cs.CV / 3/31/2026

Key Points

  • The paper introduces a new instance-centric benchmark for 3D panoptic occupancy prediction, which jointly predicts voxel-wise semantics and instance identities within a unified 3D scene representation.
  • It addresses key dataset gaps by releasing ADMesh, an autonomous-driving-focused 3D mesh library with 15K+ high-quality models, diverse textures, and rich semantic annotations.
  • It also releases CarlaOcc, a physically consistent panoptic occupancy dataset with 100K+ CARLA-generated frames and instance-level voxel occupancy ground truth down to 0.05 m resolution.
  • The authors propose standardized evaluation metrics and run a benchmark of representative models to enable fair comparisons and reproducible research.
  • The resources (code and dataset) are made publicly available at the project link, supporting broader adoption for 3D panoptic perception research.
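To make the 0.05 m voxel resolution concrete, the sketch below shows one common way to quantize 3D points into a boolean occupancy grid. This is illustrative only: the function name, origin, and grid dimensions are placeholder assumptions, not the dataset's actual parameters or the authors' pipeline.

```python
import numpy as np

def points_to_occupancy(points, origin, voxel_size=0.05, dims=(512, 512, 64)):
    """Quantize 3D points (N, 3) into a boolean occupancy grid.

    points     : (N, 3) array of xyz coordinates in metres
    origin     : (3,) array, world coordinate of voxel (0, 0, 0)
    voxel_size : edge length of a voxel in metres (0.05 m here)
    dims       : grid shape; placeholder values, not CarlaOcc's real extent
    """
    # Map each point to an integer voxel index.
    idx = np.floor((points - origin) / voxel_size).astype(int)
    # Drop points that fall outside the grid bounds.
    valid = np.all((idx >= 0) & (idx < np.array(dims)), axis=1)
    grid = np.zeros(dims, dtype=bool)
    grid[tuple(idx[valid].T)] = True
    return grid
```

At 0.05 m resolution, even a modest 25.6 m x 25.6 m x 3.2 m volume already spans 512 x 512 x 64 voxels, which is why fine-grained instance-level ground truth at this scale is costly to produce.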

Abstract

Panoptic occupancy prediction aims to jointly infer voxel-wise semantics and instance identities within a unified 3D scene representation. Nevertheless, progress in this field remains constrained by the absence of high-quality 3D mesh resources, instance-level annotations, and physically consistent occupancy datasets. Existing benchmarks typically provide incomplete and low-resolution geometry without instance-level annotations, limiting the development of models capable of achieving precise geometric reconstruction, reliable occlusion reasoning, and holistic 3D understanding. To address these challenges, this paper presents an instance-centric benchmark for the 3D panoptic occupancy prediction task. Specifically, we introduce ADMesh, the first unified 3D mesh library tailored for autonomous driving, which integrates over 15K high-quality 3D models with diverse textures and rich semantic annotations. Building upon ADMesh, we further construct CarlaOcc, a large-scale, physically consistent panoptic occupancy dataset generated using the CARLA simulator. This dataset contains over 100K frames with fine-grained, instance-level occupancy ground truth at voxel resolutions as fine as 0.05 m. Furthermore, standardized evaluation metrics are introduced to quantify the quality of existing occupancy datasets. Finally, a systematic benchmark of representative models is established on the proposed dataset, which provides a unified platform for fair comparison and reproducible research in the field of 3D panoptic perception. Code and dataset are available at https://mias.group/CarlaOcc.
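The abstract mentions standardized evaluation metrics without spelling them out. A common starting point for panoptic tasks is Panoptic Quality (PQ), which can be adapted from 2D segments to voxel instance maps; the sketch below is an illustrative adaptation under that assumption, not the paper's actual metric, and the function name is hypothetical.

```python
import numpy as np

def voxel_panoptic_quality(pred, gt, iou_thresh=0.5):
    """PQ-style score over voxel instance maps of identical shape.

    pred, gt : integer arrays; 0 = empty/ignored, positive ids = instances.
    A predicted and ground-truth instance are matched when their voxel
    IoU exceeds iou_thresh; IoU > 0.5 guarantees the match is unique.
    """
    pred_ids = [i for i in np.unique(pred) if i != 0]
    gt_ids = [i for i in np.unique(gt) if i != 0]
    matched_pred, matched_gt, iou_sum = set(), set(), 0.0
    for g in gt_ids:
        g_mask = gt == g
        for p in pred_ids:
            if p in matched_pred:
                continue
            p_mask = pred == p
            inter = np.logical_and(g_mask, p_mask).sum()
            union = np.logical_or(g_mask, p_mask).sum()
            iou = inter / union if union else 0.0
            if iou > iou_thresh:
                matched_pred.add(p)
                matched_gt.add(g)
                iou_sum += iou
                break
    tp = len(matched_gt)
    fp = len(pred_ids) - len(matched_pred)  # unmatched predictions
    fn = len(gt_ids) - tp                   # missed ground-truth instances
    denom = tp + 0.5 * fp + 0.5 * fn
    return iou_sum / denom if denom else 0.0
```

A perfect prediction scores 1.0; each missed or hallucinated instance costs half a count in the denominator, so the score degrades smoothly with both segmentation and detection errors.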