A Semantically Disentangled Unified Model for Multi-category 3D Anomaly Detection

arXiv cs.CV / 3/27/2026


Key Points

  • The paper addresses multi-category 3D anomaly detection in point clouds, using models trained only on normal data. It focuses on a failure mode called Inter-Category Entanglement (ICE), in which latent features from different categories overlap, leading to incorrect semantic priors during reconstruction and unreliable anomaly scores.
  • It proposes the Semantically Disentangled Unified Model, which reconstructs features conditioned on disentangled semantic representations so that features from different categories do not overlap.
  • The approach combines three components: coarse-to-fine global tokenization for instance-level semantic identity, category-conditioned contrastive learning to separate category semantics, and a geometry-guided decoder for semantically consistent reconstruction.
  • Experiments on Real3D-AD and Anomaly-ShapeNet show state-of-the-art performance for both unified and category-specific settings, with reported object-level AUROC gains of 2.8% (unified) and 9.1% (category-specific) and improved reliability of unified 3D anomaly detection.
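
To make the idea of separating category semantics more concrete, here is a minimal NumPy sketch of a supervised-contrastive-style loss over per-instance global tokens. It is not the paper's implementation: the function name `category_contrastive_loss`, the `temperature` value, and the use of a standard supervised contrastive formulation are all assumptions chosen for illustration. The loss is low when tokens of the same category cluster together and high when categories are entangled.

```python
import numpy as np


def category_contrastive_loss(tokens, labels, temperature=0.1):
    """Illustrative supervised-contrastive-style loss (hypothetical helper).

    tokens: (N, D) array, one global token per point-cloud instance.
    labels: (N,) array of category ids.
    """
    tokens = np.asarray(tokens, dtype=np.float64)
    labels = np.asarray(labels)

    # L2-normalise so that dot products are cosine similarities.
    z = tokens / np.linalg.norm(tokens, axis=1, keepdims=True)
    logits = z @ z.T / temperature

    # Exclude each anchor's similarity with itself from the softmax.
    self_mask = np.eye(len(labels), dtype=bool)
    logits = np.where(self_mask, -np.inf, logits)

    # Numerically stable row-wise log-softmax.
    row_max = logits.max(axis=1, keepdims=True)
    shifted = logits - row_max
    log_prob = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))

    # Positives are other instances that share the anchor's category.
    pos_mask = (labels[:, None] == labels[None, :]) & ~self_mask
    pos_count = pos_mask.sum(axis=1)
    valid = pos_count > 0
    if not valid.any():
        return 0.0  # no anchor has a positive partner in this batch

    # Average log-probability of the positives for each valid anchor.
    pos_log_prob = np.where(pos_mask, log_prob, 0.0).sum(axis=1)
    return float(-(pos_log_prob[valid] / pos_count[valid]).mean())


# Toy batch: two categories, first well separated, then entangled.
labels = np.array([0, 0, 1, 1])
separated = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
entangled = np.array([[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [0.1, 0.9]])
loss_separated = category_contrastive_loss(separated, labels)
loss_entangled = category_contrastive_loss(entangled, labels)
```

In this toy batch `loss_separated` is smaller than `loss_entangled`, which is the behavior a disentangling objective of this kind is meant to reward.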

Abstract

3D anomaly detection targets the detection and localization of defects in 3D point clouds using models trained solely on normal data. While a unified model improves scalability by learning across multiple categories, it often suffers from Inter-Category Entanglement (ICE), where latent features from different categories overlap, causing the model to adopt incorrect semantic priors during reconstruction and ultimately yielding unreliable anomaly scores. To address this issue, we propose the Semantically Disentangled Unified Model for 3D Anomaly Detection, which reconstructs features conditioned on disentangled semantic representations. Our framework consists of three key components: (i) Coarse-to-Fine Global Tokenization for forming instance-level semantic identity, (ii) Category-Conditioned Contrastive Learning for disentangling category semantics, and (iii) a Geometry-Guided Decoder for semantically consistent reconstruction. Extensive experiments on Real3D-AD and Anomaly-ShapeNet demonstrate that our method achieves state-of-the-art performance for both unified and category-specific models, improving object-level AUROC by 2.8% and 9.1%, respectively, while enhancing the reliability of unified 3D anomaly detection.
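
As a rough illustration of how reconstruction-based scoring and the object-level AUROC metric fit together, the sketch below scores each point by the distance between its extracted feature and its reconstructed feature, pools those scores into one object-level score, and computes AUROC from pairwise comparisons. The abstract does not specify the scoring rule, so the L2 distance, the max pooling, and the helper names `point_scores`, `object_score`, and `auroc` are assumptions for illustration only.

```python
import numpy as np


def point_scores(features, reconstructed):
    """Per-point anomaly score: L2 distance between feature and reconstruction."""
    diff = np.asarray(features, dtype=np.float64) - np.asarray(reconstructed, dtype=np.float64)
    return np.linalg.norm(diff, axis=1)


def object_score(features, reconstructed):
    """Object-level score: the largest per-point score (assumed pooling rule)."""
    return float(point_scores(features, reconstructed).max())


def auroc(scores, is_anomalous):
    """AUROC as the probability that an anomalous object outscores a normal one.

    Ties count as half a win, which matches the usual rank-based definition.
    """
    scores = np.asarray(scores, dtype=np.float64)
    y = np.asarray(is_anomalous, dtype=bool)
    pos, neg = scores[y], scores[~y]
    if len(pos) == 0 or len(neg) == 0:
        raise ValueError("AUROC needs at least one normal and one anomalous sample")
    diff = pos[:, None] - neg[None, :]
    return float(((diff > 0) + 0.5 * (diff == 0)).mean())


# Toy example: one point of the second object is reconstructed poorly.
feats = np.array([[0.0, 0.0], [1.0, 1.0]])
recon = np.array([[0.0, 0.0], [1.0, 4.0]])
example_object_score = object_score(feats, recon)

# Four objects, the last two anomalous and scored higher than the normal ones.
perfect = auroc([0.1, 0.2, 0.8, 0.9], [0, 0, 1, 1])
```

Under this view, entangled semantics hurt the metric because a reconstruction drawn from the wrong category's prior raises the scores of normal objects, which narrows the gap between normal and anomalous scores.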