Unsupervised Graph Modeling for Anomaly Detection in Accounting Subject Relationships

arXiv cs.LG / 4/30/2026

Key Points

  • The paper proposes an unsupervised graph neural network framework to detect anomalies in accounting subject relationship structures without needing labeled anomaly data.
  • It represents accounting subjects as graph nodes and encodes co-occurrence plus debit/credit correspondence from business records as weighted edges to build period-level association graphs.
  • Using message passing, the method learns node embeddings that capture both subject attributes and neighborhood structural context.
  • For detection, a relation reconstruction decoder estimates how plausible each subject-pair connection is; edge-level anomaly scores are derived from deviations in reconstruction probability and then aggregated into node-level risk rankings that localize anomalies.
  • In comparative experiments on accounting data, the method shows more stable overall discrimination and higher accuracy among top-ranked anomalies than the baselines, and it outputs traceable subject-pair risk clues.
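
The graph-construction step in the second key point can be illustrated with a minimal sketch. It is not the paper's implementation: the record layout `(voucher_id, subject, side, amount)`, the function name `build_association_graph`, the sample entries, and the rule of crediting the smaller line amount to each debit/credit pair are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import product

# Hypothetical voucher lines for one period: (voucher_id, subject, side, amount).
entries = [
    ("V1", "Cash", "debit", 100.0),
    ("V1", "Revenue", "credit", 100.0),
    ("V2", "Cash", "debit", 50.0),
    ("V2", "Revenue", "credit", 50.0),
    ("V3", "Inventory", "debit", 80.0),
    ("V3", "Accounts Payable", "credit", 80.0),
]

def build_association_graph(entries):
    """Return {(debit_subject, credit_subject): {"count": int, "amount": float}}."""
    # Group the lines of each voucher by side.
    by_voucher = defaultdict(lambda: {"debit": [], "credit": []})
    for voucher_id, subject, side, amount in entries:
        by_voucher[voucher_id][side].append((subject, amount))

    # Every debit/credit pair inside the same voucher contributes to one edge.
    edges = defaultdict(lambda: {"count": 0, "amount": 0.0})
    for sides in by_voucher.values():
        for (d_subj, d_amt), (c_subj, c_amt) in product(sides["debit"], sides["credit"]):
            edge = edges[(d_subj, c_subj)]
            edge["count"] += 1                 # co-occurrence frequency
            edge["amount"] += min(d_amt, c_amt)  # assumed amount-aggregation rule
    return dict(edges)

graph = build_association_graph(entries)
for pair, weight in graph.items():
    print(pair, weight)
```

Either `count` or `amount` could serve as the edge weight; the summary names both as candidate statistics and does not say how they are combined.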

Abstract

This paper addresses anomaly detection in accounting subject association structures and proposes a structured modeling and unsupervised detection framework based on graph neural networks. The framework mines stable correspondences between subjects, and identifies structural deviations, in general ledger details and voucher entries. Accounting subjects are first abstracted as graph nodes, and the co-occurrence and debit/credit correspondence of subjects within the same business record are abstracted as weighted edges. Edge weights are statistical measures such as co-occurrence frequency or aggregated amount, which yields a period-level accounting subject association graph. In the representation learning stage, a message passing mechanism fuses each node's own attributes with its neighborhood context to produce node embeddings that carry structural information. In the anomaly detection stage, a relation reconstruction decoder estimates the plausibility of each subject-pair connection, and edge-level anomaly scores are defined by the degree of deviation in reconstruction probabilities. These scores are then aggregated into node-level risk rankings and used for local anomaly localization. The framework captures both local substructure anomalies and anomalous cross-community connections without relying on anomaly labels, and it outputs traceable subject-pair risk clues. Comparative experiments show more stable overall discriminative performance and higher accuracy among top-ranked results.
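
The data flow from message passing to edge-level and node-level scores can be sketched as follows. This is an illustration under stated assumptions, not the paper's model: it uses a single propagation round over a symmetrically normalized adjacency matrix, an inner-product decoder with a sigmoid, the score `1 - p` for each observed edge, and the mean of incident edge scores as node risk. The weight matrix `W` is random and untrained, so the printed ranking only demonstrates the mechanics; the toy graph and all function names are hypothetical.

```python
import numpy as np

def normalize_adjacency(A):
    """Symmetric normalization with self-loops: D^-1/2 (A + I) D^-1/2."""
    A_hat = A + np.eye(A.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))
    return A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def message_passing(A, X, W):
    """One round: mix each node's features with its neighbors', then project."""
    return np.tanh(normalize_adjacency(A) @ X @ W)

def edge_anomaly_scores(A, Z):
    """Score each observed edge by 1 - reconstruction probability (assumed rule)."""
    P = 1.0 / (1.0 + np.exp(-(Z @ Z.T)))  # inner-product decoder
    rows, cols = np.nonzero(np.triu(A, k=1))
    return {(int(i), int(j)): float(1.0 - P[i, j]) for i, j in zip(rows, cols)}

def node_risk(edge_scores, n_nodes):
    """Aggregate edge scores to nodes by averaging over incident edges."""
    totals = np.zeros(n_nodes)
    counts = np.zeros(n_nodes)
    for (i, j), score in edge_scores.items():
        for node in (i, j):
            totals[node] += score
            counts[node] += 1
    return totals / np.maximum(counts, 1)

# Toy undirected, weighted association graph over four subjects.
A = np.array([
    [0.0, 2.0, 1.0, 0.0],
    [2.0, 0.0, 1.0, 0.0],
    [1.0, 1.0, 0.0, 1.0],
    [0.0, 0.0, 1.0, 0.0],
])
X = np.eye(4)                                     # placeholder node attributes
W = np.random.default_rng(0).normal(size=(4, 2))  # untrained, illustrative only

Z = message_passing(A, X, W)
scores = edge_anomaly_scores(A, Z)
risk = node_risk(scores, A.shape[0])
ranking = np.argsort(-risk)  # highest-risk subjects first
print(scores)
print(ranking)
```

In a trained model the weights would be fitted so that frequent, stable subject pairs reconstruct with high probability, which is what makes a low reconstruction probability on an observed pair informative.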