Citation Grounding: Detecting and Reducing LLM Citation Hallucinations via Legal Citation Graphs
arXiv cs.CL / 6/2/2026
Key Points
- The paper proposes “citation grounding (CG)”, a metric that checks whether LLM-generated legal citations are supported by a ground-truth legal citation graph extracted from 100.8 million Ukrainian court decisions.
- CG evaluates each citation along three dimensions: precision (the cited provision exists), relevance (it is contextually appropriate), and temporality (it was valid at the relevant time). Scoring them separately lets the metric diagnose different hallucination types.
- On 100 Ukrainian legal queries answered by five LLM systems (four accessed via AWS Bedrock and one production RAG system), CG scores range from 0.791 to 0.873, and 13–21% of citations are hallucinated.
- To reduce hallucinations without human annotation, the authors introduce “Citation Grounding DPO (CG-DPO)”, which automatically creates preference pairs by corrupting verified citations using targeted strategies. They use these pairs to fine-tune Qwen2.5-7B-Instruct with LoRA so that the model reliably distinguishes correct from corrupted citations.
- The citation graph, evaluation framework, and CG-DPO dataset are released as open resources for further research and benchmarking.
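
As a rough illustration of the three CG dimensions listed above, the following minimal sketch checks citations against a toy citation graph. Everything here is an assumption made for illustration: the `Provision` schema, the placeholder provision identifiers, the topic-matching proxy for relevance, and the per-dimension averaging are hypothetical and are not taken from the paper, whose actual scoring formula is not given in this summary.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Provision:
    """One node of a toy citation graph (hypothetical schema)."""
    topics: frozenset
    valid_from: date
    valid_to: Optional[date] = None  # None means still in force


# Hypothetical graph with placeholder identifiers: provision id -> metadata.
GRAPH = {
    "CODE-A art. 10": Provision(frozenset({"contracts"}), date(2004, 1, 1)),
    "CODE-A art. 20": Provision(
        frozenset({"damages"}), date(2004, 1, 1), date(2015, 12, 31)
    ),
}


def check_citation(cite_id, query_topic, as_of, graph=GRAPH):
    """Return one boolean per dimension for a single citation."""
    node = graph.get(cite_id)
    if node is None:
        # The provision is absent from the graph, so nothing else can hold.
        return {"precision": False, "relevance": False, "temporality": False}
    in_force = node.valid_from <= as_of and (
        node.valid_to is None or as_of <= node.valid_to
    )
    return {
        "precision": True,
        # Assumption: relevance is approximated by a simple topic match.
        "relevance": query_topic in node.topics,
        "temporality": in_force,
    }


def grounding_rates(citations, query_topic, as_of, graph=GRAPH):
    """Fraction of citations passing each dimension (assumed aggregation)."""
    results = [check_citation(c, query_topic, as_of, graph) for c in citations]
    if not results:
        return {}
    dims = ("precision", "relevance", "temporality")
    return {d: sum(r[d] for r in results) / len(results) for d in dims}


rates = grounding_rates(
    ["CODE-A art. 10", "CODE-A art. 20", "CODE-A art. 999"],
    query_topic="contracts",
    as_of=date(2020, 1, 1),
)
```

In this toy run, one citation passes all three checks, one exists but is off-topic and no longer in force, and one does not exist at all, so the three failure modes show up as separate rates rather than a single pass/fail flag.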
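
The automatic preference-pair construction described for CG-DPO could look roughly like the sketch below. The two corruption strategies (`to_nonexistent`, `to_other_provision`), the placeholder provision identifiers, and the helper `make_pair` are hypothetical; the summary does not say which targeted strategies the authors use. The `prompt` / `chosen` / `rejected` record layout is the one commonly used by DPO training libraries, and is assumed here rather than confirmed by the source.

```python
import random

# Hypothetical set of verified provisions, standing in for the citation graph.
KNOWN_PROVISIONS = {"CODE-A art. 10", "CODE-A art. 20", "CODE-B art. 5"}


def to_nonexistent(cite, known, rng):
    """Swap the article number for one that is absent from the graph."""
    prefix = cite.rsplit(" ", 1)[0]
    while True:
        candidate = f"{prefix} {rng.randint(9000, 9999)}"
        if candidate not in known:
            return candidate


def to_other_provision(cite, known, rng):
    """Swap in a different provision that exists but was not the verified one."""
    return rng.choice(sorted(known - {cite}))


def make_pair(prompt, answer, cite, strategy, known=KNOWN_PROVISIONS, seed=0):
    """Build one preference record from a verified answer, with no human labels."""
    if cite not in known:
        raise ValueError("citation must be verified against the graph first")
    if cite not in answer:
        raise ValueError("answer does not contain the citation to corrupt")
    rng = random.Random(seed)  # seeded so the pair is reproducible
    corrupted = strategy(cite, known, rng)
    return {
        "prompt": prompt,
        "chosen": answer,                            # verified citation
        "rejected": answer.replace(cite, corrupted),  # corrupted citation
        "rejected_citation": corrupted,
        "strategy": strategy.__name__,
    }


answer = "The claim is governed by CODE-A art. 10."
pair_missing = make_pair("Which provision applies?", answer,
                         "CODE-A art. 10", to_nonexistent)
pair_swapped = make_pair("Which provision applies?", answer,
                         "CODE-A art. 10", to_other_provision)
```

Keeping the strategy name in each record makes it possible to check later whether a fine-tuned model handles some corruption types better than others.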



