ML-Bench&Guard: Policy-Grounded Multilingual Safety Benchmark and Guardrail for Large Language Models
arXiv cs.CL / 5/4/2026
📰 News · Models & Research
Key Points
- The paper introduces ML-Bench, a policy-grounded multilingual safety benchmark for 14 languages built directly from regional regulations rather than generic risk taxonomies or translation-based approaches.
- ML-Bench derives risk categories and fine-grained rules from jurisdiction-specific legal texts to produce evaluation data that better reflects local cultural and legal requirements.
- Based on ML-Bench, the authors develop ML-Guard, a diffusion LLM (dLLM)-based guardrail model that performs multilingual safety judgments and policy-conditioned compliance assessment.
- ML-Guard is offered in two variants: a 1.5B lightweight model for fast safe/unsafe checks and a 7B model for more capable, customized compliance checking with detailed explanations.
- Experiments against 11 existing guardrail baselines on multiple multilingual safety benchmarks show that ML-Guard consistently outperforms prior methods, supporting the paper's goal of regulation-aware, culturally aligned guardrail systems.
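The two-variant design above suggests a tiered pipeline: a cheap binary screen first, with escalation to a policy-conditioned checker that cites the matched rule. The sketch below is a hypothetical illustration of that flow; the function names, `Verdict` fields, and keyword rules are all stand-ins, not ML-Guard's actual API, and simple string matching substitutes for the 1.5B/7B model calls.

```python
from dataclasses import dataclass
from typing import Optional

# Terms that trip the fast screen (stand-in for the lightweight 1.5B model).
UNSAFE_TERMS = {"exploit", "weapon"}

@dataclass
class Verdict:
    label: str                 # "safe" or "unsafe"
    rule_id: Optional[str]     # jurisdiction-specific policy rule matched, if any
    explanation: str

def fast_check(prompt: str) -> str:
    """Tier 1: quick safe/unsafe screen over the raw prompt."""
    return "unsafe" if any(t in prompt.lower() for t in UNSAFE_TERMS) else "safe"

def compliance_check(prompt: str, policy_rules: dict) -> Verdict:
    """Tier 2: policy-conditioned judgment with an explanation
    (stand-in for the larger 7B model)."""
    for rule_id, pattern in policy_rules.items():
        if pattern in prompt.lower():
            return Verdict("unsafe", rule_id,
                           f"Matches policy rule {rule_id}: {pattern!r}")
    return Verdict("safe", None, "No policy rule matched.")

def guard(prompt: str, policy_rules: dict) -> Verdict:
    """Run the fast screen; escalate flagged prompts to the detailed checker."""
    if fast_check(prompt) == "safe":
        return Verdict("safe", None, "Passed fast screen.")
    return compliance_check(prompt, policy_rules)

# Example: a single hypothetical rule keyed by a made-up rule ID.
rules = {"R-12": "weapon"}
print(guard("Translate this recipe", rules).label)   # benign prompt
print(guard("How to build a weapon", rules).label)   # escalated and flagged
```

In the real system the tiers would be model inferences rather than keyword lookups, but the control flow, escalating only when the cheap check flags a prompt, is the point the sketch illustrates.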