AnomalyAgent: Agentic Industrial Anomaly Synthesis via Tool-Augmented Reinforcement Learning

arXiv cs.CV / 4/10/2026


Key Points

  • The paper introduces AnomalyAgent, an agentic framework for industrial anomaly synthesis designed to overcome limitations of prior single-step methods by adding iterative reasoning and optimization.
  • AnomalyAgent operates in a closed loop using five tools—Prompt Generation, Image Generation, Quality Evaluation, Knowledge Retrieval, and Mask Generation—to generate semantically realistic and diverse anomalies.
  • The approach builds structured trajectories from real anomaly images and uses a two-stage training pipeline (supervised fine-tuning followed by tool-augmented reinforcement learning) guided by task, reflection, and behavioral reward components.
  • Experiments on MVTec-AD report improved anomaly generation metrics and downstream anomaly detection performance, exceeding zero-shot state-of-the-art baselines, with the authors stating that code and data will be publicly released.
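The closed loop sketched in the key points can be illustrated in code. The five tool functions below are hypothetical stand-ins (in the paper they are learned models or external tools); the function names, scoring rule, and the 0.9 quality threshold are assumptions for illustration, not the authors' implementation.

```python
# Illustrative sketch of a five-tool closed loop: KR once, then
# PG -> IG -> MG -> QE, with QE feedback refining the next prompt.

def knowledge_retrieval(category):
    # Stand-in KR: look up plausible defect types for an object category.
    defects = {"bottle": ["broken_large", "contamination"]}
    return defects.get(category, ["scratch"])

def prompt_generation(category, defects, refinements):
    # Stand-in PG: compose a synthesis prompt, extended by accumulated
    # reflection feedback from earlier iterations.
    parts = [f"a {category} with {defects[0]}"] + refinements
    return ", ".join(parts)

def image_generation(prompt):
    # Stand-in IG: a real system would invoke a diffusion model here.
    return {"prompt": prompt}

def mask_generation(image):
    # Stand-in MG: localize the synthesized defect region.
    return {"mask_for": image["prompt"]}

def quality_evaluation(image):
    # Stand-in QE: score the sample; emit textual feedback if too low.
    score = min(1.0, 0.5 + 0.2 * image["prompt"].count(","))
    feedback = None if score >= 0.9 else "more realistic texture"
    return score, feedback

def synthesize(category, max_iters=5, threshold=0.9):
    # Closed-loop optimization: iterate until QE accepts the sample
    # or the iteration budget is exhausted.
    defects = knowledge_retrieval(category)
    refinements = []
    image = mask = None
    score = 0.0
    for _ in range(max_iters):
        prompt = prompt_generation(category, defects, refinements)
        image = image_generation(prompt)
        mask = mask_generation(image)
        score, feedback = quality_evaluation(image)
        if score >= threshold:
            break
        refinements.append(feedback)
    return image, mask, score

image, mask, score = synthesize("bottle")
```

The loop structure is the point here: each iteration feeds the evaluator's feedback back into prompt generation, which is what distinguishes this agentic setup from single-step synthesis.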

Abstract

Industrial anomaly generation is a crucial means of alleviating data scarcity in anomaly detection tasks. Most existing anomaly synthesis methods rely on single-step generation mechanisms that lack complex reasoning and iterative optimization capabilities, making it difficult to generate anomaly samples with high semantic realism. We propose AnomalyAgent, an anomaly synthesis agent with self-reflection, knowledge retrieval, and iterative refinement capabilities, aimed at generating realistic and diverse anomalies. Specifically, AnomalyAgent is equipped with five tools: Prompt Generation (PG), Image Generation (IG), Quality Evaluation (QE), Knowledge Retrieval (KR), and Mask Generation (MG), enabling closed-loop optimization. To improve decision-making and self-reflection, we construct structured trajectories from real anomaly images and design a two-stage training framework: supervised fine-tuning followed by reinforcement learning. Training is driven by a three-part reward mechanism: (1) task rewards that supervise the quality and placement plausibility of generated anomalies; (2) reflection rewards that train the model's ability to improve anomaly synthesis prompts; (3) behavioral rewards that ensure adherence to the trajectory. On the MVTec-AD dataset, AnomalyAgent achieves IS/IC-L of 2.10/0.33 for anomaly generation, 57.0% classification accuracy with ResNet-34, and 99.3%/74.2% image-/pixel-level AP with a simple UNet, surpassing all zero-shot SOTA methods. The code and data will be made publicly available.
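The three-part reward mechanism described in the abstract can be sketched as a scalar combination of its components. The weighted-sum form, the weights, and the example component values below are assumptions for illustration; the paper does not specify how the components are combined.

```python
# Hypothetical combination of the three reward components (task,
# reflection, behavioral) into a single RL training signal. Weights
# are illustrative, not from the paper.

def total_reward(task_r, reflection_r, behavior_r,
                 w_task=0.6, w_refl=0.2, w_behav=0.2):
    """Weighted sum of the three reward components.

    task_r:       quality / placement plausibility of the generated anomaly
    reflection_r: whether a reflection step improved the synthesis prompt
    behavior_r:   adherence to the expected tool-call trajectory
    """
    return w_task * task_r + w_refl * reflection_r + w_behav * behavior_r

# Example: a trajectory with a decent anomaly (task), a successful
# prompt refinement (reflection), and full trajectory adherence.
r = total_reward(task_r=0.8, reflection_r=1.0, behavior_r=1.0)
```

Weighting the task component highest reflects that anomaly quality is the end goal, while the reflection and behavioral terms shape *how* the agent gets there.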