BlasBench: An Open Benchmark for Irish Speech Recognition

arXiv cs.CL / 4/14/2026

Key Points

  • BlasBench is an open Irish-specific ASR evaluation harness that includes Irish-aware text normalisation to preserve linguistic features like fadas, lenition, and eclipsis.
  • The benchmark evaluates 12 end-user ASR systems across four architecture families using Common Voice ga-IE and FLEURS ga-IE under a shared evaluation protocol.
  • All Whisper variants exceed 100% WER, meaning they make more word errors than there are words in the reference transcripts, which shows how poorly current Whisper models handle Irish speech.
  • The best open model, omniASR LLM 7B, achieves 30.65% WER on Common Voice and 39.09% WER on FLEURS, setting a new baseline for open Irish ASR.
  • A key finding is a cross-dataset generalisation gap: models fine-tuned on Common Voice are 33–43 WER points worse on FLEURS, a degradation that testing on a single dataset would miss.
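
A WER above 100% can look like a reporting error, so a minimal sketch of the standard word-level definition may help: WER is the word edit distance (substitutions + deletions + insertions) divided by the number of reference words, and insertions can push it past 1.0. The function name `word_error_rate` and the example sentences are illustrative assumptions, not taken from the BlasBench code.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] = edit distance between the first i-1 reference words
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Illustrative (made-up) example: a 3-word reference and a 7-word
# hypothesis that shares no words with it. All 3 reference words are
# substituted and 4 more words are inserted: 7 errors / 3 words.
reference = "tá sé fuar"
hypothesis = "ta shay fooar in the morning today"
wer = word_error_rate(reference, hypothesis)
print(f"WER = {wer:.2%}")
```

Because the denominator is the reference length only, a system that hallucinates long output or transcribes into the wrong language can score well above 100%.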

Abstract

No open Irish-specific benchmark compares end-user ASR systems under a shared Irish-aware evaluation protocol. To fill this gap, we release BlasBench, an open evaluation harness with Irish-aware text normalisation that preserves fadas, lenition, and eclipsis. We benchmark 12 systems across four architecture families on Common Voice ga-IE and FLEURS ga-IE. All Whisper variants exceed 100% WER. The best open model (omniASR LLM 7B) achieves 30.65% WER on Common Voice and 39.09% on FLEURS. Models fine-tuned on Common Voice lose 33–43 WER points on FLEURS, revealing a generalisation gap that is invisible to single-dataset evaluation.
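
The abstract does not spell out the normalisation rules, so the following is only a rough sketch of what "Irish-aware" normalisation could look like. It assumes the normaliser lowercases text and strips punctuation, while keeping acute accents (fadas), the "h" that marks lenition, and eclipsis prefixes, including hyphenated ones such as "n-" and "t-". The function `normalise_irish` and its rules are hypothetical and are not BlasBench's actual implementation.

```python
import re
import unicodedata

# Hypothetical rule set (an assumption, not BlasBench's real rules):
# remove everything except word characters, whitespace, hyphens, apostrophes.
_PUNCT = re.compile(r"[^\w\s'-]")
# Hyphens or apostrophes that are not attached to a word on both sides.
_STRAY_MARKS = re.compile(r"(?<!\w)[-']+|[-']+(?!\w)")
_WHITESPACE = re.compile(r"\s+")


def normalise_irish(text: str) -> str:
    # NFC composes "a" + combining acute into "á", so fadas are kept as
    # single code points instead of being stripped as stray marks.
    text = unicodedata.normalize("NFC", text)
    text = text.replace("\u2019", "'")  # unify curly apostrophes
    # Lowercasing keeps eclipsis letters: "mBaile" becomes "mbaile".
    text = text.lower()
    text = _PUNCT.sub(" ", text)
    # Keep word-internal hyphens ("t-uisce", "n-athair"); drop stray ones.
    text = _STRAY_MARKS.sub(" ", text)
    return _WHITESPACE.sub(" ", text).strip()


print(normalise_irish("Tá an t-uisce i mBaile Átha Cliath!"))
# Fadas distinguish words: "féar" (grass) vs "fear" (man).
print(normalise_irish("féar") != normalise_irish("fear"))
```

A generic normaliser that folds text to ASCII would merge pairs such as "féar" and "fear", which changes which hypothesis words count as errors when WER is computed.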