Cohere Transcribe WebGPU: state-of-the-art multilingual speech recognition in your browser
Reddit r/LocalLLaMA / 3/28/2026

Yesterday, Cohere released their first speech-to-text model, which now tops the OpenASR leaderboard (for English, but the model does support 14 different languages). So, I decided to build a WebGPU demo for it: running the model entirely locally in the browser with Transformers.js. I hope you like it! Link to demo (+ source code): https://huggingface.co/spaces/CohereLabs/Cohere-Transcribe-WebGPU
📰 Categories: News · Developer Stack & Infrastructure · Signals & Early Trends · Tools & Practical Usage · Models & Research
Key Points
- Cohere released its first speech-to-text model, which is reported to top the OpenASR leaderboard (at least for English) while supporting 14 languages.
- A developer built a WebGPU demo that runs the transcription model entirely locally in the browser using Transformers.js.
- The demo and its source code are published on Hugging Face Spaces, enabling others to test and build similar client-side speech recognition experiences.
- The release highlights the growing feasibility of high-performing multilingual ASR models running on-device, which can improve privacy and reduce latency for browser-based apps.
Related Articles
- Black Hat Asia (AI Business)
- "The Agent Didn't Decide Wrong. The Instructions Were Conflicting — and Nobody Noticed." (Dev.to)
- Top 5 LLM Gateway Alternatives After the LiteLLM Supply Chain Attack (Dev.to)
- Stop Counting Prompts — Start Reflecting on AI Fluency (Dev.to)
- Reliable Function Calling in Deeply Recursive Union Types: Fixing Qwen Models' Double-Stringify Bug (Dev.to)