Reliable Self-Harm Risk Screening via Adaptive Multi-Agent LLM Systems
arXiv cs.AI · April 27, 2026
Key Points
- The paper argues that existing “LLM-as-a-judge” evaluations for multi-step self-harm/depression screening lack reliability estimates and cannot explain how errors compound across multiple LLM judgments, making them less suitable for safety-critical use.
- It proposes a statistical framework for multi-agent LLM pipelines represented as DAGs, modeling each agent as a stochastic categorical decision and replacing heuristic voting with adaptive, principled decision-making.
- The method adds agent-level performance confidence bounds, a bandit-based adaptive sampling strategy that adjusts based on input difficulty, and regret guarantees with logarithmic error growth in deployment.
- Experiments on two behavioral-health datasets (AEGIS 2.0, N=161; a stratified SWMH Reddit sample, N=250) show notably lower false positive rates, improving precision without increasing false negatives, including a roughly 40% reduction in incorrect flagging of safe content on AEGIS 2.0.
- Overall, the results indicate that adaptive sampling can meaningfully improve reliability/precision in behavioral health risk screening while maintaining recall in the evaluated setting.
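The summary does not spell out the paper's exact estimator, but the flavor of agent-level confidence bounds combined with adaptive sampling can be sketched with a hypothetical Hoeffding-based stopping rule: keep querying a stochastic LLM judge until the empirical flag rate is separated from chance by the bound, spending fewer calls on easy inputs and more on ambiguous ones. All names, thresholds, and the Hoeffding choice below are illustrative assumptions, not the authors' method.

```python
import math
import random

def hoeffding_halfwidth(n, delta=0.05):
    # Hoeffding bound: with probability >= 1 - delta,
    # |p_hat - p| <= sqrt(ln(2/delta) / (2n)).
    return math.sqrt(math.log(2 / delta) / (2 * n))

def adaptive_screen(judge, max_calls=50, delta=0.05):
    """Query a stochastic judge (1 = 'risk', 0 = 'safe') until the
    empirical flag rate is separated from 0.5 by the Hoeffding
    half-width, or the call budget runs out. Returns (label, calls)."""
    flags = 0
    for n in range(1, max_calls + 1):
        flags += judge()
        p_hat = flags / n
        if abs(p_hat - 0.5) > hoeffding_halfwidth(n, delta):
            return ("risk" if p_hat > 0.5 else "safe"), n
    # Budget exhausted: fall back to the majority decision.
    return ("risk" if flags / max_calls > 0.5 else "safe"), max_calls

# Illustrative "easy" input: the judge flags risk 90% of the time,
# so the stopping rule needs only a handful of calls.
random.seed(0)
easy_judge = lambda: 1 if random.random() < 0.9 else 0
label, calls = adaptive_screen(easy_judge)
```

Difficult inputs, where the judge hovers near 50/50, exhaust the budget instead, which is the adaptive behavior the key points attribute to the bandit-based sampler: effort scales with input difficulty rather than being fixed by heuristic voting.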