
Hierarchical Reference Sets for Robust Unsupervised Detection of Scattered and Clustered Outliers

arXiv cs.AI / 3/16/2026

📰 News | Ideas & Deep Analysis | Models & Research

Key Points

  • The paper proposes a novel hierarchical reference set approach that uses graph structures to enable unsupervised detection of both scattered and clustered outliers in IoT data.
  • It leverages local and global reference sets derived from the graph to evaluate anomalies from multiple perspectives, helping to separate scattered outliers from clustered ones.
  • The method is designed to keep clustered anomalies from interfering with the identification of scattered outliers, while using the graph structure to reflect and isolate clustered outlier groups.
  • Extensive experiments, including comparisons, ablation studies, downstream clustering validation, and hyperparameter sensitivity analyses, demonstrate the approach's effectiveness.
  • The authors provide source code on GitHub (https://github.com/gordonlok/DROD), enabling practitioners to apply the method to IoT anomaly detection and clustering tasks.

Abstract

Most real-world IoT data analysis tasks, such as clustering and anomaly event detection, are unsupervised and highly susceptible to the presence of outliers. In addition to sporadic scattered outliers caused by factors such as faulty sensor readings, IoT systems often exhibit clustered outliers. These occur when multiple devices or nodes produce similar anomalous measurements, for instance, owing to localized interference, emerging security threats, or regional false alarms, forming micro-clusters. These clustered outliers can be easily mistaken for normal behavior because of their relatively high local density, thereby obscuring the detection of both scattered and contextual anomalies. To address this, we propose a novel outlier detection paradigm that leverages the natural neighboring relationships using graph structures. This facilitates multi-perspective anomaly evaluation by incorporating reference sets at both local and global scales derived from the graph. Our approach enables the effective recognition of scattered outliers without interference from clustered anomalies, while the graph structure simultaneously helps reflect and isolate clustered outlier groups. Extensive experiments, including comparative performance analysis, ablation studies, validation on downstream clustering tasks, and evaluation of hyperparameter sensitivity, demonstrate the efficacy of the proposed method. The source code is available at https://github.com/gordonlok/DROD.
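
The abstract does not spell out the algorithm, so the sketch below is not the authors' DROD method. It is a minimal illustration of the general idea, under assumptions that are ours alone: the graph is a mutual k-nearest-neighbor graph, the "local" view is each point's connected component in that graph, and the "global" reference set is the union of large components. All function names (`knn`, `components`, `detect`) and the parameters `k` and `min_size` are hypothetical.

```python
import numpy as np

def knn(X, k):
    """Return each point's k nearest neighbours (self excluded) and the distance matrix."""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)
    return np.argsort(d, axis=1, kind="stable")[:, :k], d

def components(idx):
    """Connected components of the mutual-kNN graph, found with union-find."""
    n = len(idx)
    parent = list(range(n))

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]  # path halving
            a = parent[a]
        return a

    nbrs = [set(row.tolist()) for row in idx]
    for i in range(n):
        for j in nbrs[i]:
            if i in nbrs[j]:  # keep an edge only if both points list each other
                parent[find(i)] = find(j)
    return np.array([find(i) for i in range(n)])

def detect(X, k=5, min_size=10):
    """Label points as normal / clustered / scattered and score them (illustrative only)."""
    idx, d = knn(X, k)
    comp = components(idx)
    _, inverse, counts = np.unique(comp, return_inverse=True, return_counts=True)
    size = counts[inverse]  # size of the component each point belongs to
    # Assumption: small components are micro-clusters, singletons are scattered outliers.
    kind = np.where(size >= min_size, "normal",
                    np.where(size == 1, "scattered", "clustered"))
    # Global reference set: points in large components only, so a dense
    # micro-cluster cannot make its own members look normal.
    # (Assumes at least k + 1 such points exist.)
    ref = np.flatnonzero(size >= min_size)
    d_ref = np.sort(d[:, ref], axis=1)[:, :k]
    score = d_ref.mean(axis=1)
    return kind, score / np.median(score[ref])

# Toy data: a 10x10 grid of normal points, one tight micro-cluster, two isolated points.
gx, gy = np.meshgrid(np.arange(10.0), np.arange(10.0))
normal = np.column_stack([gx.ravel(), gy.ravel()])
micro = np.array([30.0, 30.0]) + 0.1 * np.array(
    [[0, 0], [1, 0], [0, 1], [1, 1], [2, 0], [0, 2]], dtype=float)
scattered = np.array([[-15.0, 5.0], [5.0, 25.0]])
X = np.vstack([normal, micro, scattered])
kind, score = detect(X)
```

In this toy setup the micro-cluster forms its own small component, so it is flagged as a group, while the score measured against the global reference set stays high for its members despite their high local density. A plain kNN-distance score computed over all points would instead rate the micro-cluster as dense and therefore normal, which is the failure mode the abstract describes.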