Feature Perturbation Pool-based Fusion Network for Unified Multi-Class Industrial Defect Detection

arXiv cs.CV / April 22, 2026


Key Points

  • The paper introduces FPFNet, a unified multi-class industrial defect detection model aimed at avoiding the need to train separate networks per defect category.
  • FPFNet uses a stochastic feature perturbation pool that injects diverse noise patterns (Gaussian noise, F-Noise, and F-Drop) into feature representations to improve robustness to domain shifts and previously unseen defect morphologies.
  • A multi-layer feature fusion module with residual connections and normalization combines hierarchical features from both encoder and decoder to capture cross-scale relationships while preserving spatial detail for accurate defect localization.
  • Built on the UniAD architecture, the approach reports state-of-the-art results on MVTec-AD and VisA, with improvements in both image-level and pixel-level AUROC, while adding no extra learnable parameters or computational overhead.
  • Experiments suggest the method mitigates degraded performance commonly caused by inter-class feature perturbation when multiple defect categories are modeled together.
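
The perturbation pool described above can be illustrated with a minimal sketch: at each training step, one perturbation is sampled at random and applied to the extracted feature map. The paper does not publish exact formulations here, so the `f_noise` and `f_drop` variants below are assumptions (frequency-domain noise and frequency-component dropping, respectively); only the additive Gaussian case is a standard operation.

```python
import numpy as np

def gaussian_noise(feat, rng, sigma=0.1):
    # Additive Gaussian noise on the feature map.
    return feat + rng.normal(0.0, sigma, size=feat.shape)

def f_noise(feat, rng, sigma=0.1):
    # Assumption: multiplicative noise on the 2-D Fourier spectrum
    # of each channel, mapped back to the spatial domain.
    spec = np.fft.fft2(feat, axes=(-2, -1))
    spec = spec * (1.0 + rng.normal(0.0, sigma, size=spec.shape))
    return np.fft.ifft2(spec, axes=(-2, -1)).real

def f_drop(feat, rng, drop_ratio=0.1):
    # Assumption: randomly zero out a fraction of frequency components.
    spec = np.fft.fft2(feat, axes=(-2, -1))
    mask = rng.random(spec.shape) >= drop_ratio
    return np.fft.ifft2(spec * mask, axes=(-2, -1)).real

def perturb(feat, rng):
    # The "pool": draw one perturbation uniformly per training step.
    ops = [gaussian_noise, f_noise, f_drop]
    op = ops[rng.integers(len(ops))]
    return op(feat, rng)

rng = np.random.default_rng(0)
feat = rng.standard_normal((8, 32, 32))  # (C, H, W) feature map
out = perturb(feat, rng)
```

Because each operation is applied to features rather than pixels, the pool enriches the training distribution without any learnable parameters, consistent with the paper's claim of zero added parameters.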

Abstract

Multi-class defect detection constitutes a critical yet challenging task in industrial quality inspection, where existing approaches typically suffer from two fundamental limitations: (i) the necessity of training separate models for each defect category, resulting in substantial computational and memory overhead, and (ii) degraded robustness caused by inter-class feature perturbation when heterogeneous defect categories are jointly modeled. In this paper, we present FPFNet, a Feature Perturbation Pool-based Fusion Network that synergistically integrates a stochastic feature perturbation pool with a multi-layer feature fusion strategy to address these challenges within a unified detection framework. The feature perturbation pool enriches the training distribution by randomly injecting diverse noise patterns -- including Gaussian noise, F-Noise, and F-Drop -- into the extracted feature representations, thereby strengthening the model's robustness against domain shifts and unseen defect morphologies. Concurrently, the multi-layer feature fusion module aggregates hierarchical feature representations from both the encoder and decoder through residual connections and normalization, enabling the network to capture complex cross-scale relationships while preserving fine-grained spatial details essential for precise defect localization. Built upon the UniAD architecture [You et al., 2022], our method achieves state-of-the-art performance on two widely adopted benchmarks: 97.17% image-level AUROC and 96.93% pixel-level AUROC on MVTec-AD, and 91.08% image-level AUROC and 99.08% pixel-level AUROC on VisA, surpassing existing methods by notable margins while introducing no additional learnable parameters or computational complexity.
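
The multi-layer fusion described in the abstract combines hierarchical encoder and decoder features through residual connections and normalization. A minimal sketch of that idea follows; the resampling scheme, the channel-wise normalization, and the helper names (`fuse`, `resize_nearest`) are illustrative assumptions, not the paper's exact module.

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    # Normalize across the channel dimension at each spatial location
    # (assumption: LayerNorm-style, parameter-free).
    mu = x.mean(axis=0, keepdims=True)
    var = x.var(axis=0, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def resize_nearest(x, hw):
    # Nearest-neighbour resampling of a (C, h, w) map to (C, H, W);
    # a stand-in for whatever interpolation the real module uses.
    _, h, w = x.shape
    H, W = hw
    ri = np.arange(H) * h // H
    ci = np.arange(W) * w // W
    return x[:, ri][:, :, ci]

def fuse(enc_feats, dec_feats):
    # Hypothetical multi-layer fusion: bring every encoder/decoder level
    # to the finest resolution, accumulate them through a residual sum,
    # and normalize after each level.
    target_hw = enc_feats[0].shape[-2:]
    fused = np.zeros_like(enc_feats[0])
    for e, d in zip(enc_feats, dec_feats):
        e = resize_nearest(e, target_hw)
        d = resize_nearest(d, target_hw)
        fused = layer_norm(fused + e + d)  # residual connection + norm
    return fused

rng = np.random.default_rng(1)
# Three pyramid levels, 4 channels each, at 32/16/8 spatial resolution.
enc = [rng.standard_normal((4, s, s)) for s in (32, 16, 8)]
dec = [rng.standard_normal((4, s, s)) for s in (32, 16, 8)]
fused = fuse(enc, dec)
```

Keeping the accumulation at the finest resolution is what preserves the fine-grained spatial detail the abstract emphasizes for pixel-level localization, while the summed coarse levels contribute the cross-scale context.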