Multimodal Industrial Anomaly Detection via Geometric Prior

arXiv cs.CV / 3/25/2026

Key Points

  • The paper targets multimodal industrial anomaly detection for geometric defects that are hard to capture with 2D approaches, such as subtle surface deformations and irregular contours.
  • It introduces GPAD, which uses a point-cloud “expert” to extract fine-grained geometric features, including differential computation of normal vectors to form a geometric prior.
  • A two-stage fusion strategy is proposed to combine multimodal inputs while effectively leveraging the geometric prior from 3D point data.
  • The method further applies attention-based fusion and anomaly-region segmentation grounded in the geometric prior to improve defect perception.
  • Experiments report that GPAD achieves state-of-the-art detection accuracy on the MVTec-3D AD and Eyecandies datasets.
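
The summary does not spell out how GPAD's "differential computation of normal vectors" works. As a rough illustration of the general idea only, the sketch below estimates surface normals from finite differences, assuming an organized point cloud (an H x W grid of XYZ points, as depth sensors commonly produce). The function name `estimate_normals` and the grid layout are assumptions for this example, not details from the paper.

```python
import numpy as np


def estimate_normals(points: np.ndarray) -> np.ndarray:
    """Estimate unit surface normals for an organized point cloud.

    points: array of shape (H, W, 3) holding XYZ coordinates.
    Assumption: array neighbors are also neighbors on the surface.
    """
    # Finite-difference tangent vectors along grid rows and columns.
    d_row = np.gradient(points, axis=0)
    d_col = np.gradient(points, axis=1)
    # The cross product of two tangents is perpendicular to the surface.
    normals = np.cross(d_col, d_row)
    # Normalize to unit length, guarding against zero-length vectors.
    length = np.linalg.norm(normals, axis=-1, keepdims=True)
    return normals / np.clip(length, 1e-12, None)


# Usage: a flat plane with a small bump; normals tilt only near the bump.
yy, xx = np.mgrid[0:32, 0:32].astype(float)
zz = 0.5 * np.exp(-((xx - 16) ** 2 + (yy - 16) ** 2) / 8.0)
normals = estimate_normals(np.stack([xx, yy, zz], axis=-1))
# Deviation from the plane's normal (0, 0, 1) is a simple geometric cue.
deviation = 1.0 - normals[..., 2]
```

A surface deformation that barely changes pixel intensity can still change the local normal direction noticeably, which is one reason normals are a natural candidate for a geometric prior.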

Abstract

Multimodal industrial anomaly detection aims to detect complex geometric defects, such as subtle surface deformations and irregular contours, that are difficult to detect with 2D-based methods. However, current multimodal methods do not make effective use of crucial geometric information such as surface normal vectors and 3D shape topology, which results in low detection accuracy. In this paper, we propose a novel Geometric Prior-based Anomaly Detection network (GPAD). First, we propose a point cloud expert model that performs fine-grained geometric feature extraction, employing differential normal vector computation to enhance the geometric details of the extracted features and to generate a geometric prior. Second, we propose a two-stage fusion strategy to efficiently leverage both the complementarity of multimodal data and the geometric prior inherent in 3D points. We further propose attention fusion and anomaly region segmentation based on the geometric prior, which enhance the model's ability to perceive geometric defects. Extensive experiments show that our multimodal industrial anomaly detection model outperforms state-of-the-art (SOTA) methods in detection accuracy on both the MVTec-3D AD and Eyecandies datasets.
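
The abstract does not say how the attention fusion is wired. To make the general mechanism concrete, here is a minimal sketch of generic scaled dot-product cross-attention in which image tokens attend to geometric tokens. It omits learned projections, and the names `cross_attention_fuse`, `rgb_tokens`, and `geo_tokens` are hypothetical; this should not be read as GPAD's actual architecture.

```python
import numpy as np


def softmax(x: np.ndarray, axis: int = -1) -> np.ndarray:
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def cross_attention_fuse(rgb_tokens: np.ndarray, geo_tokens: np.ndarray):
    """Fuse two modalities with scaled dot-product cross-attention.

    rgb_tokens: (N, D) image features, used as queries.
    geo_tokens: (M, D) geometric features, used as keys and values.
    Returns the fused (N, D) features and the (N, M) attention weights.
    Assumption: no learned query/key/value projections, for brevity.
    """
    dim = rgb_tokens.shape[-1]
    # Similarity between every image token and every geometric token.
    scores = rgb_tokens @ geo_tokens.T / np.sqrt(dim)
    weights = softmax(scores, axis=-1)
    # Each image token receives a weighted mix of geometric features.
    attended = weights @ geo_tokens
    # Residual connection keeps the original image information.
    return rgb_tokens + attended, weights


# Usage with random stand-in features (not real data).
rng = np.random.default_rng(0)
rgb = rng.normal(size=(4, 8))
geo = rng.normal(size=(6, 8))
fused, attn = cross_attention_fuse(rgb, geo)
```

In a trained model the queries, keys, and values would normally pass through learned linear layers first; the sketch keeps only the attention arithmetic.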