VERDICT: Verifiable Evolving Reasoning with Directive-Informed Collegial Teams for Legal Judgment Prediction

arXiv cs.AI / 3/23/2026

Key Points

  • VERDICT introduces a self-refining, multi-agent framework that simulates a virtual collegial panel for Legal Judgment Prediction, assigning specialized roles such as fact structuring, legal retrieval, opinion drafting, and supervisory verification.
  • It implements a traceable draft-verify-revise workflow with explicit Pass/Reject feedback to generate verifiable reasoning traces and revision rationales.
  • A Hybrid Jurisprudential Memory (HJM), built on the Micro-Directive Paradigm, stores precedent standards and distills validated verification trajectories into updated Micro-Directives, supporting continual learning across cases.
  • VERDICT achieves state-of-the-art results on the CAIL2018 dataset and generalizes well to the newly introduced CJO2025 dataset, which uses a strict future-time split. The authors release code and data for reproducibility.
  • The work advances interpretable, adaptable LJP capable of evolving with jurisprudential practice, addressing both accuracy and verifiability.
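
The draft-verify-revise workflow with Pass/Reject feedback described in the key points can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the names `Review`, `Trace`, `draft_verify_revise`, `toy_drafter`, and `toy_verifier` are hypothetical, the round limit is an assumption, and the rule-based stand-ins take the place of the LLM agents the paper describes.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass
class Review:
    passed: bool    # True = Pass, False = Reject
    rationale: str  # the verifier's reason, reused as revision feedback


@dataclass
class Trace:
    # One (draft, review) pair per round, so every revision is auditable.
    steps: List[Tuple[str, Review]] = field(default_factory=list)


def draft_verify_revise(
    facts: str,
    drafter: Callable[[str, Optional[str]], str],
    verifier: Callable[[str, str], Review],
    max_rounds: int = 3,  # assumed cap; the source does not state one
) -> Tuple[str, Trace]:
    """Redraft until the verifier passes the draft or the rounds run out."""
    trace = Trace()
    feedback: Optional[str] = None
    draft = ""
    for _ in range(max_rounds):
        draft = drafter(facts, feedback)
        review = verifier(facts, draft)
        trace.steps.append((draft, review))
        if review.passed:
            break
        feedback = review.rationale  # a Reject feeds the next draft
    return draft, trace


# Hypothetical rule-based stand-ins for the drafting and supervising agents.
def toy_drafter(facts: str, feedback: Optional[str]) -> str:
    if feedback is None:
        return "charge: theft"
    return "charge: theft; article: 264"  # placeholder article number


def toy_verifier(facts: str, draft: str) -> Review:
    if "article:" in draft:
        return Review(True, "draft cites a law article")
    return Review(False, "missing law article")


final, trace = draft_verify_revise(
    "The defendant took a phone.", toy_drafter, toy_verifier
)
for round_no, (text, review) in enumerate(trace.steps, start=1):
    verdict = "Pass" if review.passed else "Reject"
    print(f"round {round_no}: {verdict} ({review.rationale}) -> {text}")
```

Keeping each draft together with its review is what makes the trace inspectable after the fact: a reader can see which draft was rejected, why, and what changed in the next round.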

Abstract

Legal Judgment Prediction (LJP) predicts applicable law articles, charges, and penalty terms from case facts. Beyond accuracy, LJP calls for intrinsically interpretable and legally grounded reasoning that can reconcile statutory rules with precedent-informed standards. However, existing methods often behave as static, one-shot predictors, providing limited procedural support for verifiable reasoning and little capability to adapt as jurisprudential practice evolves. We propose VERDICT, a self-refining collaborative multi-agent framework that simulates a virtual collegial panel. VERDICT assigns specialized agents to complementary roles (e.g., fact structuring, legal retrieval, opinion drafting, and supervisory verification) and coordinates them in a traceable draft-verify-revise workflow with explicit Pass/Reject feedback, producing verifiable reasoning traces and revision rationales. To capture evolving case experience, we further introduce a Hybrid Jurisprudential Memory (HJM) grounded in the Micro-Directive Paradigm, which stores precedent standards and continually distills validated multi-agent verification trajectories into updated Micro-Directives for continual learning across cases. We evaluate VERDICT on CAIL2018 and a newly constructed CJO2025 dataset with a strict future time-split for temporal generalization. VERDICT achieves state-of-the-art performance on CAIL2018 and demonstrates strong generalization on CJO2025. To facilitate reproducibility and further research, we release our code and the dataset at https://anonymous.4open.science/r/ARR-4437.
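
A strict future time-split of the kind the abstract mentions for CJO2025 can be illustrated with a short sketch: every training case predates a cutoff, and every test case falls on or after it, so no future information leaks into training. The field name `judged_on`, the function `future_time_split`, the cutoff date, and the sample records are all assumptions made for illustration; the abstract does not give the actual cutoff or the dataset schema.

```python
from datetime import date
from typing import Dict, List, Tuple

Case = Dict[str, object]


def future_time_split(
    cases: List[Case], cutoff: date
) -> Tuple[List[Case], List[Case]]:
    """Put cases judged before `cutoff` in train, all later ones in test."""
    train = [c for c in cases if c["judged_on"] < cutoff]
    test = [c for c in cases if c["judged_on"] >= cutoff]
    return train, test


# Made-up records; real case data would carry facts, charges, and so on.
cases = [
    {"id": "a", "judged_on": date(2023, 5, 1)},
    {"id": "b", "judged_on": date(2024, 11, 30)},
    {"id": "c", "judged_on": date(2025, 2, 14)},
]

train, test = future_time_split(cases, cutoff=date(2025, 1, 1))
print([c["id"] for c in train], [c["id"] for c in test])
```

Unlike a random split, this arrangement evaluates a model only on cases decided after everything it was trained on, which is the property that makes it a test of temporal generalization.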