Voice of India: A Large-Scale Benchmark for Real-World Speech Recognition in India
arXiv cs.CL / 4/22/2026
Key Points
- The article introduces Voice of India, a large-scale, closed-source benchmark aimed at evaluating real-world speech recognition for Indian languages using unscripted telephonic conversations rather than scripted speech.
- It covers 15 major Indian languages across 139 regional clusters and includes 306,230 utterances totaling 536 hours from 36,691 speakers, with transcripts designed to reflect real spelling variations.
- The benchmark is intended to reduce dataset-specific overfitting and to account for natural language phenomena, such as spelling variation, that strict single-reference word error rate (WER) penalizes unfairly in Indic and code-mixed settings.
- The authors analyze ASR performance geographically at the district level and across factors including audio quality, speaking rate, gender, and device type, identifying where current systems underperform.
- They conclude by offering actionable insights for improving Indic ASR systems for real-world deployment across diverse regions and recording conditions.
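
To illustrate why strict single-reference WER can penalize legitimate spelling variation, here is a minimal Python sketch. It is not the benchmark's scoring code, which this summary does not describe. It assumes that each utterance may carry several acceptable transcripts and scores a hypothesis against the closest one. The function names and the romanized Hindi example are hypothetical.

```python
from typing import Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Word-level Levenshtein distance (substitutions, insertions, deletions)."""
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def wer(reference: str, hypothesis: str) -> float:
    """Standard single-reference WER: edit distance over reference length."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def multi_reference_wer(references: Sequence[str], hypothesis: str) -> float:
    """Assumed scheme: score against the closest acceptable transcript."""
    return min(wer(ref, hypothesis) for ref in references)


# Hypothetical example: "nahi" and "nahin" are both common romanizations.
references = ["mera recharge nahi hua", "mera recharge nahin hua"]
hypothesis = "mera recharge nahin hua"

single = wer(references[0], hypothesis)               # 1 substitution / 4 words
multi = multi_reference_wer(references, hypothesis)   # exact match on 2nd reference
print(f"single-reference WER: {single:.2f}, multi-reference WER: {multi:.2f}")
```

With only the first reference, the hypothesis is charged one substitution out of four words even though its spelling is acceptable. Taking the minimum over the accepted variants removes that penalty.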