Ethio-ASR: Joint Multilingual Speech Recognition and Language Identification for Ethiopian Languages
arXiv cs.CL / 3/26/2026
Key Points
- Ethio-ASR introduces a suite of multilingual, CTC-based automatic speech recognition models jointly trained for five Ethiopian languages (Amharic, Tigrinya, Oromo, Sidaama, and Wolaytta).
- The models are trained on the WAXAL corpus using several pre-trained speech encoders and are evaluated against strong multilingual baselines such as OmniASR.
- The best Ethio-ASR model reports an average word error rate (WER) of 30.48% on the WAXAL test set, outperforming the best OmniASR result while using substantially fewer parameters.
- The release includes analyses of gender bias and of how vowel length and consonant gemination affect ASR errors, as well as insights into the training dynamics of multilingual CTC systems.
- The authors make the models and codebase publicly available, aiming to address the severe underrepresentation of Ethiopian languages in speech technology.
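
For readers unfamiliar with the metric behind the reported 30.48% figure, the following is a minimal sketch of how word error rate is conventionally computed: the word-level edit distance between reference and hypothesis, divided by the number of reference words. It is not the authors' evaluation code. The function names are hypothetical, whitespace tokenization is an assumption, and the per-language macro-average in `average_wer` is only one plausible reading of "average WER"; the summary does not say how the average is taken.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two token lists (substitutions, insertions, deletions)."""
    prev = list(range(len(hyp) + 1))  # distance from an empty reference prefix
    for i, r in enumerate(ref, 1):
        curr = [i]  # distance from ref[:i] to an empty hypothesis
        for j, h in enumerate(hyp, 1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate for one utterance, assuming whitespace tokenization."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref, hyp) / len(ref)


def average_wer(per_language_wer):
    """Unweighted mean over languages (an assumption about how the average is formed)."""
    return sum(per_language_wer.values()) / len(per_language_wer)


if __name__ == "__main__":
    # Placeholder tokens, not real transcripts: one substitution and one deletion.
    print(wer("a b c d", "a x c"))
```

Because WER counts whole-word mismatches, a single wrong vowel length or a missed geminate consonant makes the entire word count as an error, which is one reason the analyses mentioned above can matter for the headline number.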