ASCAT: An Arabic Scientific Corpus and Benchmark for Advanced Translation Evaluation
arXiv cs.CL / 4/3/2026
Key Points
- The paper introduces ASCAT, a high-quality English–Arabic parallel corpus and benchmark built to evaluate scientific translation on full abstracts rather than on short or single-domain sentences.
- ASCAT is constructed with a multi-engine translation pipeline that combines generative AI (Gemini), transformer-based models (Hugging Face quickmt-en-ar), and commercial MT APIs (Google Translate, DeepL), followed by human expert validation at the lexical, syntactic, and semantic levels.
- The benchmark covers five scientific domains (physics, mathematics, computer science, quantum mechanics, and artificial intelligence), and abstracts average about 141.7 English words and 111.78 Arabic words.
- The released corpus contains 67,293 English tokens and 60,026 Arabic tokens, with an Arabic vocabulary of 17,604 unique words, reflecting Arabic’s morphological richness.
- Three state-of-the-art LLMs evaluated on ASCAT (GPT-4o-mini, Gemini-3.0-Flash-Preview, and Qwen3-235B-A22B) obtain BLEU scores of 37.07, 30.44, and 23.68 respectively, which the paper presents as evidence of the benchmark's discriminative value for scientific MT evaluation and of its usefulness for training domain-specific models.
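
For readers who want a feel for how figures like these are produced, here is a minimal Python sketch of the two kinds of numbers above: corpus statistics (token count, vocabulary size, mean length) and a corpus-level BLEU score. It is an illustration under stated assumptions, not the paper's evaluation code. The function names `corpus_stats` and `corpus_bleu` are hypothetical. Tokenization is plain whitespace splitting, and BLEU uses one reference per hypothesis with no smoothing. The summary does not say which tokenizer or BLEU implementation the authors used, so scores from this sketch are not comparable to the reported ones.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_stats(texts):
    """Token count, vocabulary size, and mean length per text.

    Assumption: tokens are whitespace-separated strings.
    """
    tokenized = [t.split() for t in texts]
    total = sum(len(toks) for toks in tokenized)
    vocab = {tok for toks in tokenized for tok in toks}
    return {"tokens": total, "vocab": len(vocab), "mean_len": total / len(texts)}


def corpus_bleu(hypotheses, references, max_n=4):
    """Corpus-level BLEU on a 0-100 scale.

    Assumptions: one reference per hypothesis, whitespace tokens, no smoothing.
    """
    matches = [0] * max_n  # clipped n-gram matches, per order
    totals = [0] * max_n   # hypothesis n-gram counts, per order
    hyp_len = ref_len = 0
    for hyp, ref in zip(hypotheses, references):
        h, r = hyp.split(), ref.split()
        hyp_len += len(h)
        ref_len += len(r)
        for n in range(1, max_n + 1):
            h_counts, r_counts = ngrams(h, n), ngrams(r, n)
            # Counter intersection keeps the minimum count per n-gram,
            # which is exactly BLEU's clipping rule.
            matches[n - 1] += sum((h_counts & r_counts).values())
            totals[n - 1] += sum(h_counts.values())
    # Without smoothing, any order with zero matches gives a score of zero.
    if hyp_len == 0 or min(matches) == 0:
        return 0.0
    # Geometric mean of the modified n-gram precisions.
    log_prec = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_n
    # Brevity penalty: penalize output shorter than the reference.
    bp = 1.0 if hyp_len > ref_len else math.exp(1 - ref_len / hyp_len)
    return 100 * bp * math.exp(log_prec)


if __name__ == "__main__":
    # Toy data, not taken from ASCAT.
    refs = ["the cat sat on a mat"]
    hyps = ["the cat sat on the mat"]
    print(corpus_stats(refs))
    print(round(corpus_bleu(hyps, refs), 2))
```

For Arabic output in particular, whitespace tokenization is a rough approximation: clitics attach to words, so tokenizer choice can shift BLEU noticeably. A standardized tool such as sacreBLEU is the more usual choice when scores need to be comparable across papers.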




