OpenAI Releases Three Realtime Audio Models: GPT-Realtime-2, GPT-Realtime-Translate, and GPT-Realtime-Whisper in the Realtime API

MarkTechPost / 5/8/2026

Key Points

OpenAI has released three new purpose-built realtime audio models—GPT-Realtime-2, GPT-Realtime-Translate, and GPT-Realtime-Whisper—available via its Realtime API.
GPT-Realtime-2 is built for realtime voice agents that can reason.
GPT-Realtime-Translate supports speech translation across 70+ languages for live, interactive translation use cases.
GPT-Realtime-Whisper handles streaming transcription, letting developers transcribe speech as it is spoken.

Three purpose-built audio models expand what developers can build with live voice: reasoning agents, speech translation across 70+ languages, and streaming transcription.

The post OpenAI Releases Three Realtime Audio Models: GPT-Realtime-2, GPT-Realtime-Translate, and GPT-Realtime-Whisper in the Realtime API appeared first on MarkTechPost.