Mistral's first open-weight TTS model Voxtral clones voices from three seconds of audio across nine languages

THE DECODER / 3/27/2026


Key Points

  • Mistral has released Voxtral, its first open-weight text-to-speech (TTS) model supporting nine languages.
  • Voxtral can clone a speaker’s voice using only three seconds of reference audio.
  • The release positions Mistral as a direct competitor in the fast-growing voice generation and TTS tooling space.
  • Because the model is open-weight, developers and researchers may be able to experiment with and adapt voice cloning workflows more easily.
  • The nine-language capability broadens potential real-world deployment scenarios beyond a single market or language.

French AI startup Mistral has released Voxtral TTS, its first text-to-speech model. It supports nine languages and can clone a voice from just three seconds of audio.

