MiMo-V2.5-ASR is a state-of-the-art end-to-end automatic speech recognition (ASR) model developed by the Xiaomi MiMo team. It is built to deliver accurate and robust transcription across Mandarin Chinese and English, multiple Chinese dialects, code-switched speech, song lyrics, knowledge-intensive content, noisy acoustic environments, and multi-speaker conversations. MiMo-V2.5-ASR achieves state-of-the-art results on a wide range of public benchmarks.

Abstract

Automatic speech recognition systems are expected to faithfully transcribe speech signals that originate from diverse languages, dialects, accents, and domains, and that are captured under a wide variety of acoustic conditions. While conventional end-to-end models perform well on in-domain data, they still fall short of real-world requirements in challenging scenarios such as dialect mixing, code-switching, knowledge-intensive content, noisy environments, and multi-speaker conversations. We present MiMo-V2.5-ASR, a large-scale end-to-end speech recognition model developed by the Xiaomi MiMo team. Through large-scale mid-training, high-quality supervised fine-tuning, and a novel reinforcement-learning algorithm, MiMo-V2.5-ASR achieves systematic improvements along the following dimensions:
XiaomiMiMo/MiMo-V2.5-ASR · Hugging Face
Reddit r/LocalLLaMA / 4/24/2026
📰 News / Signals & Early Trends / Tools & Practical Usage / Models & Research
Key Points
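- MiMo-V2.5-ASR is a state-of-the-art end-to-end automatic speech recognition (ASR) model from the Xiaomi MiMo team, aimed at high-accuracy transcription across Chinese (including multiple dialects) and English.
- It handles dialect mixing and code-switching (switching between Chinese and English), and is designed to transcribe such speech naturally without language tags.
- It is reported to perform strongly in noisy conditions (such as far-field capture), overlapping multi-speaker conversations, knowledge-intensive content (proper nouns, place names, technical terms, classical poetry, and similar), and song lyric recognition.
- The team attributes systematic improvements across multiple evaluation dimensions to its training approach: large-scale mid-training, high-quality supervised fine-tuning, and a novel reinforcement-learning algorithm.
- It reports state-of-the-art (SOTA) results on a wide range of public benchmarks, and on the more difficult English benchmarks it also shows strong performance on the Open ASR Leaderboard.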