RespondeoQA: a Benchmark for Bilingual Latin-English Question Answering
arXiv cs.CL / 4/23/2026
Key Points
- The paper introduces RespondeoQA, a bilingual Latin-English question answering and translation benchmark with about 7,800 question-answer pairs.
- Questions are sourced from Latin pedagogical materials such as exams, quizbowl-style trivia, and textbooks spanning the 1800s to the present, and are curated via automated extraction, cleaning, and manual review.
- The benchmark covers several task types: knowledge and skill questions, multihop reasoning, constrained translation, and mixed-language pairs.
- In evaluations of three large language models (Llama 3, Qwen QwQ, and OpenAI o3-mini), all models generally perform worse on skill-oriented questions than on knowledge-oriented ones, with the reasoning-focused models doing better on scansion and literary-device tasks.
- The dataset is publicly released, and the authors note that the construction pipeline can be adapted to build benchmarks for other languages.
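
To make the per-task comparison in the key points concrete, here is a minimal sketch of how accuracy could be broken down by task type. This summary does not describe the paper's scoring method, so the exact-match metric, the record fields (`task_type`, `answer`, `prediction`), and the toy records are all assumptions for illustration, not the authors' evaluation code or data.

```python
from collections import defaultdict


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so formatting differences are not counted as errors."""
    return " ".join(text.lower().split())


def accuracy_by_task(records):
    """Return {task_type: exact-match accuracy} for an iterable of record dicts.

    Exact match is an assumed metric; the benchmark may score answers differently.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for rec in records:
        task = rec["task_type"]
        total[task] += 1
        if normalize(rec["prediction"]) == normalize(rec["answer"]):
            correct[task] += 1
    return {task: correct[task] / total[task] for task in total}


# Invented toy records (hypothetical, not drawn from RespondeoQA).
records = [
    {"task_type": "knowledge", "answer": "Vergil", "prediction": "vergil"},
    {"task_type": "knowledge", "answer": "Cicero", "prediction": "Cicero"},
    {"task_type": "skill", "answer": "dactylic hexameter", "prediction": "elegiac couplet"},
    {"task_type": "skill", "answer": "chiasmus", "prediction": "Chiasmus "},
]

print(accuracy_by_task(records))  # {'knowledge': 1.0, 'skill': 0.5}
```

Grouping scores by task type rather than reporting one aggregate number is what lets a gap between knowledge and skill questions show up at all.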