Explainable AML Triage with LLMs: Evidence Retrieval and Counterfactual Checks

arXiv cs.LG / April 23, 2026


Key Points

  • The paper addresses the challenge of quickly triaging large numbers of anti-money laundering (AML) alerts under strict audit and governance rules, where unconstrained LLM explanations can be unreliable.
  • It proposes an evidence-constrained, explainable AML triage framework that combines retrieval-augmented evidence bundling, a structured LLM output contract with explicit citations, and separation of supporting versus contradicting or missing evidence.
  • The method adds counterfactual checks to ensure that small, plausible perturbations change the triage decision and rationale in a coherent and faithful way.
  • Experiments on public synthetic AML benchmarks show improved auditability and fewer hallucination errors, with the best overall triage performance (PR-AUC 0.75; Escalate F1 0.62) and strong provenance/faithfulness metrics (citation validity 0.98; evidence support 0.88; counterfactual faithfulness 0.76).
  • The authors conclude that governed and verifiable LLM systems can deliver practical decision support for AML triage while maintaining traceability and defensibility for compliance.
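The structured output contract described above can be illustrated with a small validation sketch. The field names (`decision`, `supporting`, `contradicting`, `missing`) and the allowed decision labels are assumptions for illustration only; the paper's actual schema is not specified here. The key idea is that every claim must cite evidence that actually exists in the retrieved bundle, which is how citation validity can be checked mechanically.

```python
# Hypothetical sketch of an evidence-constrained output contract for LLM
# triage responses. Field names and decision labels are illustrative,
# not the paper's actual schema.

ALLOWED_DECISIONS = {"escalate", "close", "monitor"}

def validate_triage_output(output: dict, evidence_ids: set) -> list:
    """Return a list of contract violations (empty means valid)."""
    errors = []
    if output.get("decision") not in ALLOWED_DECISIONS:
        errors.append("invalid decision: %r" % output.get("decision"))
    for section in ("supporting", "contradicting"):
        for claim in output.get(section, []):
            cited = claim.get("citations", [])
            # Every claim must cite at least one retrieved evidence item.
            if not cited:
                errors.append("%s claim has no citations" % section)
            for cid in cited:
                # Citations must resolve to items in the evidence bundle.
                if cid not in evidence_ids:
                    errors.append("%s cites unknown evidence %r" % (section, cid))
    return errors

# Example: one grounded claim and one hallucinated citation.
bundle = {"E1", "E2", "E3"}
response = {
    "decision": "escalate",
    "supporting": [{"claim": "rapid structuring pattern", "citations": ["E1", "E2"]}],
    "contradicting": [{"claim": "customer rated low risk", "citations": ["E9"]}],
    "missing": ["beneficial-ownership records"],
}
violations = validate_triage_output(response, bundle)
```

Under this contract, the hallucinated citation `E9` is flagged as a violation while the grounded claims pass, which is the mechanism behind the high citation-validity scores reported.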

Abstract

Anti-money laundering (AML) transaction monitoring generates large volumes of alerts that must be rapidly triaged by investigators under strict audit and governance constraints. While large language models (LLMs) can summarize heterogeneous evidence and draft rationales, unconstrained generation is risky in regulated workflows due to hallucinations, weak provenance, and explanations that are not faithful to the underlying decision. We propose an explainable AML triage framework that treats triage as an evidence-constrained decision process. Our method combines (i) retrieval-augmented evidence bundling from policy/typology guidance, customer context, alert triggers, and transaction subgraphs, (ii) a structured LLM output contract that requires explicit citations and separates supporting from contradicting or missing evidence, and (iii) counterfactual checks that validate whether minimal, plausible perturbations lead to coherent changes in both the triage recommendation and its rationale. We evaluate on public synthetic AML benchmarks and simulators and compare against rules, tabular and graph machine-learning baselines, and LLM-only/RAG-only variants. Results show that evidence grounding substantially improves auditability and reduces numerical and policy hallucination errors, while counterfactual validation further increases decision-linked explainability and robustness, yielding the best overall triage performance (PR-AUC 0.75; Escalate F1 0.62) and strong provenance and faithfulness metrics (citation validity 0.98; evidence support 0.88; counterfactual faithfulness 0.76). These findings indicate that governed, verifiable LLM systems can provide practical decision support for AML triage without sacrificing compliance requirements for traceability and defensibility.
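The counterfactual check in the abstract can be sketched as follows: perturb one evidence item minimally, re-run triage, and verify that the decision flips and the rationale stops citing the perturbed item. The toy triage rule, the $10,000 threshold, and the function names below are assumptions for illustration; the paper's actual decision model is not reproduced here.

```python
# Hypothetical sketch of a counterfactual faithfulness check. A minimal,
# plausible perturbation of the evidence should change both the triage
# decision and its rationale coherently. The toy triage rule and
# threshold are illustrative, not the paper's model.

def toy_triage(evidence: dict):
    """Return (decision, cited evidence ids) from a simple amount rule."""
    risky = [eid for eid, amount in evidence.items() if amount >= 10_000]
    if risky:
        return "escalate", risky
    return "close", []

def counterfactual_faithful(evidence: dict, perturb_id: str, new_value: float) -> bool:
    """True if perturbing one evidence item flips the decision AND the
    new rationale no longer cites the perturbed item."""
    base_decision, _ = toy_triage(evidence)
    perturbed = {**evidence, perturb_id: new_value}
    cf_decision, cf_cites = toy_triage(perturbed)
    decision_changed = cf_decision != base_decision
    rationale_coherent = perturb_id not in cf_cites
    return decision_changed and rationale_coherent

# Example: lowering the flagged transaction below the threshold should
# flip "escalate" to "close" and drop it from the cited evidence.
evidence = {"txn_a": 12_000.0, "txn_b": 3_000.0}
faithful = counterfactual_faithful(evidence, "txn_a", 500.0)
```

A counterfactual-faithfulness metric like the reported 0.76 can then be read as the fraction of such perturbations for which this check passes, aggregated over alerts.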