Microsoft's MAI-Transcribe-1 runs 2.5x faster than its predecessor at $0.36 per audio hour

THE DECODER / 4/3/2026


Key Points

  • Microsoft’s MAI-Transcribe-1 is a speech-to-text model that supports 25 languages and maintains accuracy even in noisy environments.
  • The new model runs about 2.5x faster than its predecessor while reportedly reducing cost to $0.36 per audio hour.
  • Microsoft is already using MAI-Transcribe-1 inside its own products, indicating near-term deployment rather than just research.
  • The release positions MAI-Transcribe-1 as a more cost-effective and higher-throughput option for transcription workloads that need real-time or large-scale processing.
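
To make the reported figures concrete, here is a minimal back-of-the-envelope sketch of what $0.36 per audio hour and a 2.5x speedup would mean for a batch workload. It assumes a flat per-hour rate with no volume tiers, minimum charges, or billing-increment rounding, none of which the article specifies. The function names and the baseline processing time are hypothetical and used only for illustration.

```python
# Figures reported in the article; everything else here is an assumption.
PRICE_PER_AUDIO_HOUR_USD = 0.36
SPEEDUP_VS_PREDECESSOR = 2.5


def transcription_cost_usd(audio_hours: float) -> float:
    """Estimated cost of transcribing `audio_hours` of audio at a flat rate."""
    if audio_hours < 0:
        raise ValueError("audio_hours must be non-negative")
    return round(audio_hours * PRICE_PER_AUDIO_HOUR_USD, 2)


def new_processing_time(old_processing_time: float) -> float:
    """Processing time after the reported speedup, in the input's unit.

    `old_processing_time` is a hypothetical measurement for the predecessor;
    the article does not give absolute processing times.
    """
    if old_processing_time < 0:
        raise ValueError("old_processing_time must be non-negative")
    return old_processing_time / SPEEDUP_VS_PREDECESSOR


if __name__ == "__main__":
    print(transcription_cost_usd(1000))  # estimated cost in USD for 1,000 audio hours
    print(new_processing_time(10))       # a job that previously took 10 hours
```

Under these assumptions, 1,000 hours of audio would cost about $360, and a job that took the predecessor 10 hours would take roughly 4 hours.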


MAI-Transcribe-1 converts speech to text quickly and accurately in 25 languages, even with background noise. Microsoft is already using the model in its own products.
