H2VLR: Heterogeneous Hypergraph Vision-Language Reasoning for Few-Shot Anomaly Detection

arXiv cs.CV / 4/17/2026

Key Points

  • The paper introduces H2VLR, a framework for few-shot anomaly detection that leverages vision-language reasoning rather than relying on simple feature matching.
  • Existing VLM-based FSAD methods are criticized for treating anomaly inference as mostly pairwise matching and for ignoring structural dependencies and global consistency.
  • H2VLR reformulates FSAD as a high-order inference problem by building a heterogeneous hypergraph that jointly models visual regions and semantic concepts.
  • Experimental comparisons indicate that H2VLR often achieves state-of-the-art performance on representative industrial and medical benchmarks.
  • The authors plan to release the code after acceptance, enabling further validation and reuse by the community.
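
To make the contrast in the key points concrete, the following is a minimal sketch of the pairwise feature matching that existing VLM-based FSAD methods are said to rely on. It is not taken from the paper. The function names, the two text embeddings ("normal" and "anomalous"), the toy 2-D features, and the temperature value are all assumptions chosen for illustration. Each patch is scored on its own against the two prompts, so no dependency between patches is modeled.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def pairwise_anomaly_scores(patch_feats, normal_text, anomalous_text, temperature=0.07):
    """Score every patch independently against two text embeddings.

    Returns, per patch, the softmax probability assigned to the
    'anomalous' prompt. All names and values here are hypothetical.
    """
    scores = []
    for patch in patch_feats:
        s_norm = cosine(patch, normal_text) / temperature
        s_anom = cosine(patch, anomalous_text) / temperature
        m = max(s_norm, s_anom)  # subtract the max for numerical stability
        e_norm = math.exp(s_norm - m)
        e_anom = math.exp(s_anom - m)
        scores.append(e_anom / (e_norm + e_anom))
    return scores

# Toy features: patch 0 aligns with "normal", patch 1 with "anomalous",
# patch 2 sits exactly between the two prompts.
patches = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
scores = pairwise_anomaly_scores(patches, normal_text=[1.0, 0.0], anomalous_text=[0.0, 1.0])
```

Because each score depends only on one patch and two prompts, a patch's neighbors and the global layout of the image have no influence on the result, which is the limitation the paper targets.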

Abstract

Anomaly detection is a classic vision task that is widely applied in industrial inspection and medical imaging. Data scarcity is a frequent problem in this task, so few-shot anomaly detection (FSAD) is attracting increasing attention. In recent years, Vision-Language Models (VLMs) have been extensively explored as a way to move beyond the traditional purely visual paradigm. However, almost all existing VLM-based FSAD schemes perform anomaly inference only by pairwise feature matching, ignoring structural dependencies and global consistency. To further advance VLM-based FSAD, we propose a Heterogeneous Hypergraph Vision-Language Reasoning (H2VLR) framework. It reformulates FSAD as a high-order inference problem over visual-semantic relations by jointly modeling visual regions and semantic concepts in a unified hypergraph. Experimental comparisons verify the effectiveness and advantages of H2VLR, which often achieves state-of-the-art (SOTA) performance on representative industrial and medical benchmarks. Our code will be released upon acceptance.
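
The abstract does not describe how the hypergraph is built or how inference runs over it, so the following is only a generic sketch of one round of hypergraph message passing, not H2VLR's method. It assumes a node list that mixes visual-region nodes and semantic-concept nodes, hyperedges given as sets of node indices, and plain mean aggregation (nodes to hyperedges, then hyperedges back to nodes). The function name, node types, toy features, and hyperedges are all hypothetical.

```python
def hypergraph_propagate(node_feats, hyperedges):
    """One round of mean-aggregation message passing on a hypergraph.

    node_feats: list of feature vectors, one per node.
    hyperedges: list of sets of node indices; a hyperedge may join any
                number of nodes, unlike an ordinary graph edge.
    """
    dim = len(node_feats[0])

    # Step 1: each hyperedge feature is the mean of its member nodes.
    edge_feats = []
    for members in hyperedges:
        edge_feats.append(
            [sum(node_feats[i][d] for i in members) / len(members) for d in range(dim)]
        )

    # Step 2: each node averages the features of the hyperedges it belongs to.
    updated = []
    for i, feat in enumerate(node_feats):
        incident = [e for e, members in zip(edge_feats, hyperedges) if i in members]
        if not incident:
            updated.append(list(feat))  # isolated node keeps its feature
        else:
            updated.append(
                [sum(e[d] for e in incident) / len(incident) for d in range(dim)]
            )
    return updated

# Hypothetical heterogeneous node set: three visual regions, two concepts.
node_types = ["region", "region", "region", "concept", "concept"]
node_feats = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]

# Each hyperedge ties several regions to one concept at once.
hyperedges = [{0, 1, 3}, {1, 2, 4}]

updated = hypergraph_propagate(node_feats, hyperedges)
```

In this toy case region 1 belongs to both hyperedges, so its updated feature blends evidence from regions 0 and 2 and from both concepts in a single step. That is the kind of high-order, multi-node relation a pairwise score cannot express.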