Rule-State Inference (RSI): A Bayesian Framework for Compliance Monitoring in Rule-Governed Domains

arXiv stat.ML / 3/24/2026


Key Points

  • The paper argues that conventional ML compliance monitoring methods wrongly assume observed data is ground truth, which fails in domains like taxation where rules are known a priori and key variables are latent and only partially observed.
  • It proposes Rule-State Inference (RSI), a Bayesian approach that encodes regulatory rules as structured priors and performs posterior inference over a latent rule-state space capturing rule activation, compliance rate, and parametric drift.
  • The authors prove three theoretical guarantees: O(1)-time absorption of regulatory changes via a prior ratio correction, independent of dataset size; Bernstein-von Mises consistency of the posterior as observations accumulate; and monotonic ELBO improvement under mean-field variational inference.
  • RSI is instantiated on the Togolese fiscal system and evaluated on a new benchmark, RSI-Togo-Fiscal-Synthetic v1.0, comprising 2,000 synthetic enterprises grounded in real OTR rules (2022–2025). Without any labeled training data, it achieves F1=0.519 and AUC=0.599.
  • The framework absorbs regulatory changes in under 1 ms, compared with 683–1082 ms for full model retraining, a speedup of at least 600×.
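
The prior ratio correction mentioned in the key points can be illustrated with a small sketch. This is not the authors' implementation: it assumes a single rule whose latent compliance rate is discretised on a hypothetical grid, Bernoulli observations, and Beta-shaped priors chosen purely for illustration. It relies only on the Bayesian identity that a posterior under a new prior is proportional to the old posterior times new_prior / old_prior, so a cached posterior can be updated without revisiting the data.

```python
import math

# Hypothetical discretised latent state: candidate compliance rates for one rule.
GRID = [i / 100 for i in range(1, 100)]

def normalise(weights):
    total = sum(weights)
    return [w / total for w in weights]

def beta_shaped_prior(a, b):
    """Beta(a, b)-shaped weights on GRID (illustrative stand-in for a rule-derived prior)."""
    return normalise([c ** (a - 1) * (1 - c) ** (b - 1) for c in GRID])

def full_posterior(prior, observations):
    """Full Bayesian update: visits every observation, so cost grows with the data."""
    log_post = [math.log(p) for p in prior]
    for y in observations:  # y = 1 compliant, y = 0 non-compliant
        for j, c in enumerate(GRID):
            log_post[j] += math.log(c if y else 1 - c)
    peak = max(log_post)  # subtract the peak for numerical stability
    return normalise([math.exp(lp - peak) for lp in log_post])

def prior_ratio_correction(old_posterior, old_prior, new_prior):
    """Reweight a cached posterior by new_prior / old_prior; the data are never touched."""
    return normalise(
        [q * pn / po for q, po, pn in zip(old_posterior, old_prior, new_prior)]
    )

# Made-up observations: 140 compliant and 60 non-compliant filings.
observations = [1] * 140 + [0] * 60
old_prior = beta_shaped_prior(8, 2)  # assumed prior before a regulatory change
new_prior = beta_shaped_prior(5, 5)  # assumed prior after the change

cached = full_posterior(old_prior, observations)
fast = prior_ratio_correction(cached, old_prior, new_prior)
slow = full_posterior(new_prior, observations)  # recomputation from scratch

# Largest pointwise difference between the corrected and recomputed posteriors.
max_gap = max(abs(f - s) for f, s in zip(fast, slow))
print(f"max difference between corrected and recomputed posterior: {max_gap:.2e}")
```

In this sketch the correction costs one pass over the grid regardless of how many observations were absorbed into the cached posterior, which is the sense in which such an update does not depend on dataset size. How the paper parameterises its priors and achieves the O(1) bound is not specified in this summary.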

Abstract

Existing machine learning frameworks for compliance monitoring -- Markov Logic Networks, Probabilistic Soft Logic, supervised models -- share a fundamental paradigm: they treat observed data as ground truth and attempt to approximate rules from it. This assumption breaks down in rule-governed domains such as taxation or regulatory compliance, where authoritative rules are known a priori and the true challenge is to infer the latent state of rule activation, compliance, and parametric drift from partial and noisy observations. We propose Rule-State Inference (RSI), a Bayesian framework that inverts this paradigm by encoding regulatory rules as structured priors and casting compliance monitoring as posterior inference over a latent rule-state space S = {(a_i, c_i, delta_i)}, where a_i captures rule activation, c_i models the compliance rate, and delta_i quantifies parametric drift. We prove three theoretical guarantees: (T1) RSI absorbs regulatory changes in O(1) time via a prior ratio correction, independently of dataset size; (T2) the posterior is Bernstein-von Mises consistent, converging to the true rule state as observations accumulate; (T3) mean-field variational inference monotonically maximizes the Evidence Lower BOund (ELBO). We instantiate RSI on the Togolese fiscal system and introduce RSI-Togo-Fiscal-Synthetic v1.0, a benchmark of 2,000 synthetic enterprises grounded in real OTR regulatory rules (2022-2025). Without any labeled training data, RSI achieves F1=0.519 and AUC=0.599, while absorbing regulatory changes in under 1ms versus 683-1082ms for full model retraining -- at least a 600x speedup.
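
To make the latent rule state (a_i, c_i, delta_i) and the idea behind guarantee T2 more concrete, here is a minimal sketch. It assumes a conjugate Beta-Bernoulli model for the compliance rate c_i of one rule, which the abstract does not specify; the `RuleState` class, the prior Beta(2, 2), and the simulated data are all hypothetical. It only illustrates posterior concentration around the true value as observations accumulate and is not the paper's proof or model.

```python
from dataclasses import dataclass
import random

@dataclass
class RuleState:
    """One element (a_i, c_i, delta_i) of the latent rule-state space S (hypothetical layout)."""
    active: bool       # a_i: whether the rule is activated
    compliance: float  # c_i: compliance rate in [0, 1]
    drift: float       # delta_i: parametric drift (interpretation assumed here)

def beta_posterior(alpha, beta, outcomes):
    """Conjugate update: Beta(alpha, beta) prior with Bernoulli outcomes."""
    k = sum(outcomes)
    n = len(outcomes)
    return alpha + k, beta + n - k

def beta_mean_sd(alpha, beta):
    """Mean and standard deviation of a Beta(alpha, beta) distribution."""
    total = alpha + beta
    mean = alpha / total
    var = alpha * beta / (total ** 2 * (total + 1))
    return mean, var ** 0.5

rng = random.Random(0)  # fixed seed so the simulation is repeatable
truth = RuleState(active=True, compliance=0.7, drift=0.0)  # made-up ground truth
outcomes = [1 if rng.random() < truth.compliance else 0 for _ in range(5000)]

sds = []
for n in (10, 100, 1000, 5000):
    a, b = beta_posterior(2.0, 2.0, outcomes[:n])
    mean, sd = beta_mean_sd(a, b)
    sds.append(sd)
    print(f"n={n:5d}  posterior mean={mean:.3f}  sd={sd:.4f}")
```

The posterior standard deviation shrinks as n grows and the posterior mean settles near the simulated compliance rate, which is the qualitative behaviour a Bernstein-von Mises result formalises. Activation and drift are carried in the data structure but not inferred in this sketch.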