The words of a meeting are no longer something you read as subtitles.
Until now, translation in Google Meet meant subtitles at the bottom of the screen. Now the speaker's voice itself is rendered as a voice in another language, with the interpretation trailing a few seconds behind and starting before the phrase is even finished. What Gemini 3.5 Live Translate changes is that, in a foreign-language meeting, you can now choose between "reading" and "listening."
The gulf that lay between "subtitles" and "interpretation"
Both Google Meet and Google Translate already had translation features. But those were subtitles built on machine translation: they transcribed what the other person said, translated the text into another language, and displayed it on screen. Read them and you grasp the meaning, yet your gaze is tied to the bottom of the screen and the rhythm of conversation breaks.
When Gemini Omni and 3.5 Flash were announced at Google I/O 2026 three months ago, voice-to-voice "simultaneous interpretation" remained a preview-stage promise, with no fixed date for full availability. Real-time interpretation that preserved the pace of speech and the tone of voice was something no service had achieved.
| Subtitle translation until now | Gemini 3.5 Live Translate |
|---|---|
| Transcribes speech, then displays the translation | Turns the voice directly into a voice in another language |
| You must follow the on-screen subtitles with your eyes | You can keep the conversation going while listening |
| Tone and intonation of the voice are lost | Preserves the speaker's tone, speed, and pitch |
| The translation appears after the sentence ends | Generated continuously a few seconds behind, before the speaker finishes |
Stop reading the words.
Hear them in another language, in the speaker's own voice.
It doesn't wait for you to finish
Conventional interpretation features started translating only "after a sentence was complete." Live Translate is streaming-based, beginning to build the translation while you are still speaking.
It picks up the start of speech
From the moment the speaker starts talking, it takes in the audio phrase by phrase. Rather than consecutive translation, which waits for a sentence to end, it is a streaming approach that processes the audio as it arrives.
Into another language, voice and all
Instead of transcribing speech into text and then reading the translation aloud, it converts directly from voice to voice. Because it preserves the speaker's tone, speed, and pitch along the way, the translated voice keeps the nuance of the original delivery.
It follows a few seconds behind
Because it doesn't wait for the speaker to finish, the interpretation keeps trailing a few seconds behind. The rhythm of conversation isn't interrupted, and talking across languages feels close to having a simultaneous interpreter in the room.
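
To make the difference between waiting for the sentence and following a few seconds behind concrete, here is a toy timing sketch. It is not how Live Translate works internally. The phrase timings and the fixed 2-second lag are assumptions chosen for illustration, and real interpretation between languages with different word order cannot always emit output phrase by phrase this neatly.

```python
from dataclasses import dataclass


@dataclass
class Phrase:
    text: str
    end_time: float  # seconds after speech starts when this phrase has been spoken


def consecutive_output_times(phrases, processing_delay):
    """Sentence-wait approach: nothing is emitted until the last phrase has ended."""
    sentence_end = phrases[-1].end_time
    return [sentence_end + processing_delay for _ in phrases]


def streaming_output_times(phrases, processing_delay):
    """Streaming approach: each phrase is emitted a fixed lag after it was spoken."""
    return [p.end_time + processing_delay for p in phrases]


# Illustrative data only: one sentence spoken as three phrases.
phrases = [
    Phrase("Thanks for joining,", 1.5),
    Phrase("today we will review the schedule", 4.0),
    Phrase("and decide the launch date.", 6.5),
]
DELAY = 2.0  # assumed lag in seconds, not a measured figure

consecutive = consecutive_output_times(phrases, DELAY)
streaming = streaming_output_times(phrases, DELAY)

# How much earlier the listener hears the first translated phrase when streaming.
first_gap = consecutive[0] - streaming[0]

for phrase, c, s in zip(phrases, consecutive, streaming):
    print(f"{phrase.text!r}: sentence-wait at {c}s, streaming at {s}s")
print(f"First phrase arrives {first_gap}s earlier with streaming")
```

In this toy model both approaches finish at the same moment. What streaming changes is when the listener starts hearing the translation, which is what keeps the rhythm of a conversation intact.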
It enters meetings, learning, and your apps
The same Live Translate model can now be used from three entry points.
With this release, Gemini 3.5 Live Translate, a streaming voice-to-voice translation model that supports more than 70 languages, has been built into Google Meet, Google Translate, and the Gemini Live API. What makes the release distinctive is that the same model landed at once in three entry points of very different character: a meeting tool, a translation app, and a developer-facing API.
In meetings, you can listen to the other person through an interpreted voice instead of reading subtitles. For language learning, you can now try speaking English in the Gemini app and hearing it come right back in a Japanese voice. And developers can embed the same interpretation feature into their own apps through the Gemini Live API.
From this week, you can choose
No special preparation is needed. It appears as a new option right inside the tools you already use.
In meetings with overseas offices
If you use Google Meet, starting this week you can choose voice interpretation for a meeting instead of subtitles. Your gaze is no longer tied to the bottom of the screen, so you can focus on the conversation.
As a language-learning partner
In the Gemini app, you can speak English and hear it come right back in a Japanese voice. Because the translation preserves your tone and pace, conversation practice feels different.
Embed it in your own app
Via the Gemini Live API, you can embed the interpretation feature into your own service. Whether a meeting app or a learning app, you can drop in the same voice interpretation as is.
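
As a rough picture of what embedding might look like structurally, the sketch below wires a stream of audio chunks through a translator and collects the translated chunks for playback. It does not use the real Gemini Live API. `fake_translate_stream`, `microphone`, and `interpret` are hypothetical names invented for this illustration, and the actual method names, audio formats, and authentication should be taken from the official documentation.

```python
import asyncio
from typing import AsyncIterator, Iterable, List


async def fake_translate_stream(
    chunks: AsyncIterator[bytes], target_language: str
) -> AsyncIterator[bytes]:
    """Hypothetical stand-in for a streaming voice-to-voice call.

    It tags each chunk as soon as it arrives instead of waiting for the
    whole input, which is the only property this sketch demonstrates.
    """
    async for chunk in chunks:
        await asyncio.sleep(0)  # hand control back to the event loop, as a network call would
        yield f"[{target_language}]".encode() + chunk


async def microphone(frames: Iterable[bytes]) -> AsyncIterator[bytes]:
    """Hypothetical audio source: yields prerecorded frames one at a time."""
    for frame in frames:
        yield frame


async def interpret(frames: Iterable[bytes], target_language: str) -> List[bytes]:
    """Feed audio in, collect translated audio out, chunk by chunk."""
    played: List[bytes] = []
    async for out in fake_translate_stream(microphone(frames), target_language):
        played.append(out)  # a real app would write this to the speaker instead
    return played


result = asyncio.run(interpret([b"hello", b"world"], "ja"))
print(result)
```

Keeping the translator behind a single async-generator boundary means a meeting app and a learning app could share the same loop and differ only in where the audio comes from and where it goes.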
"Supported" and "practical" are two different things
Keep in mind that support for more than 70 languages is ultimately a statement about coverage, that is, how many languages are supported. It does not guarantee that every language pair reaches the same quality as Japanese and English. Accuracy for pairs that do not include English varies with the combination you use.
In particular, entrusting an important negotiation, where contract terms and figures are at stake, to voice interpretation alone is overconfidence. It is more realistic to treat it as a tool for restoring the rhythm of conversation, one that adds "listening interpretation" as an option alongside "reading interpretation." Where confirmation is required, it is safer to combine it with subtitles or human verification.