Audio processing landed in llama-server with Gemma-4

Reddit r/LocalLLaMA / 4/13/2026

📰 News · Developer Stack & Infrastructure · Signals & Early Trends · Tools & Practical Usage

Key Points

  • llama.cpp’s llama-server has added speech-to-text (STT) audio processing support using the Gemma-4 E2A and E4A models.
  • This extends local LLM server capabilities beyond text generation to include transcription from audio inputs.
  • The update is reported via the LocalLLaMA community, highlighting a new capability for on-device or self-hosted deployments.
  • Users integrating llama-server can now route audio to Gemma-4-powered STT workflows within the same server stack.
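As a sketch of what that routing could look like: llama-server exposes an OpenAI-compatible `/v1/chat/completions` endpoint, and audio can be attached as a base64-encoded `input_audio` content part following the OpenAI audio-input schema. The host, port, and payload shape below are assumptions based on that schema, not taken from the post.

```python
# Sketch: ask a locally running llama-server (assumed at localhost:8080
# with an audio-capable Gemma-4 model loaded) to transcribe a WAV file.
# The `input_audio` content part follows the OpenAI audio-input schema.
import base64
import json
import urllib.request

def build_transcription_request(wav_path: str,
                                prompt: str = "Transcribe this audio.") -> dict:
    """Package a WAV file as a base64 `input_audio` chat message."""
    with open(wav_path, "rb") as f:
        audio_b64 = base64.b64encode(f.read()).decode("ascii")
    return {
        "messages": [{
            "role": "user",
            "content": [
                {"type": "text", "text": prompt},
                {"type": "input_audio",
                 "input_audio": {"data": audio_b64, "format": "wav"}},
            ],
        }]
    }

if __name__ == "__main__":
    # Send the request and print the model's transcription.
    body = json.dumps(build_transcription_request("sample.wav")).encode()
    req = urllib.request.Request(
        "http://localhost:8080/v1/chat/completions",
        data=body, headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        print(json.load(resp)["choices"][0]["message"]["content"])
```

Because the endpoint mirrors the OpenAI API, existing OpenAI-client code can usually be pointed at the local server by changing only the base URL.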


Ladies and gentlemen, it is a great pleasure to confirm that llama.cpp (llama-server) now supports STT with the Gemma-4 E2A and E4A models.
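For readers wanting to try this, a minimal launch might look like the following. The GGUF filenames are hypothetical placeholders; multimodal input in llama.cpp is supplied through a projector file passed with the real `--mmproj` flag alongside the main model.

```shell
# Sketch: serving an audio-capable model with llama-server.
# Model and projector filenames below are placeholder names, not
# confirmed release artifacts.
llama-server \
  -m gemma-4-e4a.gguf \
  --mmproj mmproj-gemma-4-e4a.gguf \
  --host 127.0.0.1 \
  --port 8080
```

Once running, the server accepts both text-only and audio-bearing chat requests on the same endpoint.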

submitted by /u/srigi