SwissGov-RSD: A Human-annotated, Cross-lingual Benchmark for Token-level Recognition of Semantic Differences Between Related Documents
arXiv cs.CL / 3/13/2026
Key Points
- SwissGov-RSD is introduced as a naturalistic, document-level cross-lingual benchmark for token-level recognition of semantic differences across related documents.
- It covers 224 multi-parallel English-German, English-French, and English-Italian documents with human-annotated token-level difference labels, enabling cross-language evaluation.
- The work evaluates a range of open-source and closed-source LLMs and encoder models under various fine-tuning settings, and finds that they perform substantially worse on this benchmark than on monolingual or synthetic benchmarks.
- The authors release code and datasets publicly to support replication and further research.
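To make the idea of token-level difference labels concrete, here is a minimal sketch of how one side of a document pair might be represented and how predictions could be scored against human annotations. Everything in it is an illustrative assumption rather than the authors' released format or evaluation code: the `LabeledDocument` class, the 0.0 to 1.0 label scale, the toy sentence, and the choice of Spearman rank correlation as the metric (the summary above does not say which metric is used).

```python
from dataclasses import dataclass
from typing import List


@dataclass
class LabeledDocument:
    """Hypothetical container: tokens plus one difference score per token."""
    tokens: List[str]
    labels: List[float]  # assumed scale: 0.0 = same meaning, 1.0 = differs


def average_ranks(values: List[float]) -> List[float]:
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def spearman(gold: List[float], predicted: List[float]) -> float:
    """Spearman rank correlation, computed as Pearson correlation of the ranks."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    rg, rp = average_ranks(gold), average_ranks(predicted)
    n = len(rg)
    mg, mp = sum(rg) / n, sum(rp) / n
    cov = sum((a - mg) * (b - mp) for a, b in zip(rg, rp))
    var_g = sum((a - mg) ** 2 for a in rg)
    var_p = sum((b - mp) ** 2 for b in rp)
    if var_g == 0 or var_p == 0:
        return 0.0  # undefined for constant input; returning 0.0 is a convention chosen here
    return cov / (var_g * var_p) ** 0.5


# Toy example (invented): only "nine" differs from the other-language version.
gold = LabeledDocument(
    tokens=["The", "office", "opens", "at", "nine", "on", "Monday"],
    labels=[0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0],
)
predicted_scores = [0.1, 0.2, 0.1, 0.0, 0.9, 0.3, 0.2]  # made-up model output
score = spearman(gold.labels, predicted_scores)
print(f"Spearman correlation on the toy example: {score:.3f}")
```

A rank-based score is used in this sketch because it rewards a model for ordering the differing tokens above the unchanged ones without requiring its raw scores to be calibrated; a thresholded precision/recall measure would be an equally plausible choice.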