A Feature Shuffling and Restoration Strategy for Universal Unsupervised Anomaly Detection

arXiv cs.CV / 3/25/2026


Key Points

  • The paper addresses a common failure mode in reconstruction-based unsupervised anomaly detection, where “identical shortcuts” let both normal and anomalous regions be reconstructed well, leading to poor outlier detection.
  • It proposes FSR (Feature Shuffling and Restoration), which reconstructs multi-scale, semantically rich feature targets instead of raw pixels to improve robustness across different data distributions.
  • FSR shuffles non-overlapping multi-scale feature blocks and then restores them, pushing the model to rely more on global context rather than local pixel-level cues.
  • A new shuffling-rate concept is introduced to control task difficulty and mitigate the identical shortcut problem across settings.
  • The authors provide theoretical justification (network structure and mutual information) and report extensive experiments showing improved and more transferable performance, with code released on GitHub.

Abstract

Unsupervised anomaly detection is vital in industrial fields, with reconstruction-based methods favored for their simplicity and effectiveness. However, reconstruction methods often encounter an identical shortcut issue, where both normal and anomalous regions are reconstructed well, so the model fails to identify outliers. The severity of this problem increases with the complexity of the normal data distribution. Consequently, existing methods may exhibit excellent detection performance in a specific scenario, but their performance declines sharply when transferred to another scenario. This paper focuses on establishing a universal model applicable to anomaly detection tasks across different settings, termed universal anomaly detection. In this work, we introduce a novel, straightforward yet efficient framework for universal anomaly detection: Feature Shuffling and Restoration (FSR), which can alleviate the identical shortcut issue across different settings. First and foremost, FSR employs multi-scale features with rich semantic information as reconstruction targets, rather than raw image pixels. Subsequently, these multi-scale features are partitioned into non-overlapping feature blocks, which are randomly shuffled and then restored to their original state using a restoration network. This simple paradigm encourages the model to focus more on global contextual information. Additionally, we introduce a novel concept, the shuffling rate, to regulate the complexity of the FSR task, thereby alleviating the identical shortcut issue across different settings. Furthermore, we provide theoretical explanations for the effectiveness of the FSR framework from two perspectives: network structure and mutual information. Extensive experimental results validate the superiority and efficiency of the FSR framework across different settings. Code is available at https://github.com/luow23/FSR.
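
To make the block-shuffling step concrete, the following is a minimal NumPy sketch of one way it could work on a single feature map. It is not taken from the released code. The function names (`to_blocks`, `from_blocks`, `shuffle_blocks`, `unshuffle_blocks`), the `(C, H, W)` layout, and the reading of the shuffling rate as "the fraction of blocks selected for permutation" are all assumptions made for illustration. The learned restoration network is not modeled; `unshuffle_blocks` only shows the target that such a network would be trained to recover.

```python
import numpy as np


def to_blocks(feat, b):
    """Split a (C, H, W) map into (N, C, b, b) non-overlapping blocks, row-major."""
    c, h, w = feat.shape
    if h % b or w % b:
        raise ValueError("H and W must be divisible by the block size")
    gh, gw = h // b, w // b
    return feat.reshape(c, gh, b, gw, b).transpose(1, 3, 0, 2, 4).reshape(gh * gw, c, b, b)


def from_blocks(blocks, h, w):
    """Inverse of to_blocks: reassemble (N, C, b, b) blocks into a (C, H, W) map."""
    _, c, b, _ = blocks.shape
    gh, gw = h // b, w // b
    return blocks.reshape(gh, gw, c, b, b).transpose(2, 0, 3, 1, 4).reshape(c, h, w)


def shuffle_blocks(feat, b, shuffle_rate, rng):
    """Permute a subset of blocks; returns (shuffled_map, perm).

    Assumption: shuffle_rate in [0, 1] is the fraction of blocks selected for
    permutation. Output block i is input block perm[i]. A selected block can
    still land on its own position, so at most round(rate * N) blocks move.
    """
    if not 0.0 <= shuffle_rate <= 1.0:
        raise ValueError("shuffle_rate must lie in [0, 1]")
    _, h, w = feat.shape
    blocks = to_blocks(feat, b)
    n = blocks.shape[0]
    k = int(round(shuffle_rate * n))
    perm = np.arange(n)
    if k >= 2:  # permuting fewer than two blocks is a no-op
        chosen = rng.choice(n, size=k, replace=False)
        perm[chosen] = rng.permutation(chosen)
    return from_blocks(blocks[perm], h, w), perm


def unshuffle_blocks(shuffled, b, perm):
    """Oracle inverse using the known permutation (stands in for the restoration target)."""
    _, h, w = shuffled.shape
    return from_blocks(to_blocks(shuffled, b)[np.argsort(perm)], h, w)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Two hypothetical scales of a feature pyramid, each shuffled independently.
    for shape, block in [((8, 16, 16), 4), ((16, 8, 8), 2)]:
        feat = rng.standard_normal(shape)
        shuffled, perm = shuffle_blocks(feat, block, shuffle_rate=0.5, rng=rng)
        restored = unshuffle_blocks(shuffled, block, perm)
        print(shape, "moved blocks:", int((perm != np.arange(perm.size)).sum()),
              "restored exactly:", bool(np.array_equal(restored, feat)))
```

Under this reading, a shuffling rate of 0 leaves the input untouched, which reduces the task to plain feature reconstruction, while higher rates displace more blocks and make local copying less useful for restoration.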