Fully local voice AI on iPhone

Reddit r/LocalLLaMA / 3/26/2026

💬 Opinion · Developer Stack & Infrastructure · Signals & Early Trends · Tools & Practical Usage

Key Points

  • A developer describes building a fully local voice AI experience on an iPhone 15 to eliminate server costs and keep a free speaking-practice service sustainable.
  • The setup uses FluidAudio to offload speech-to-text (STT) and text-to-speech (TTS) to the iPhone’s Neural Engine, allowing llama.cpp to run more effectively on the GPU without contention.
  • They report that the on-device implementation performs better than expected; the service previously ran on a self-hosted home server.
  • A GitHub repository (volocal) is shared so others can replicate the approach.
Fully local voice AI on iPhone

I'm self-hosting a totally free voice AI on my home server to help people practice speaking English. It has tens to hundreds of monthly active users, and I've been thinking about how to keep it free while making it sustainable.

The ultimate way to reduce operational costs is to run everything on-device, eliminating server costs entirely. So I decided to make the voice AI experience run fully locally on my iPhone 15, and it's working better than I expected.

One key thing that makes the app possible is using FluidAudio to offload STT and TTS to the Neural Engine, so llama.cpp can fully utilize the GPU without any contention.
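The split described above — STT and TTS on the Neural Engine, the LLM on the GPU — amounts to a three-stage pipeline where stages for different utterances can overlap because they use separate compute units. A minimal Python analogue with stub workers (threads stand in for compute units; the function names are illustrative, not FluidAudio's or llama.cpp's actual APIs):

```python
# Illustrative sketch only: the real app is Swift, with FluidAudio running
# STT/TTS on the Apple Neural Engine and llama.cpp running the LLM on the GPU.
# Splitting stages across compute units means STT/TTS and LLM work don't
# contend for the same hardware and can overlap across utterances.
import queue
import threading

def stage(inbox, outbox, work):
    """Run one pipeline stage on its own thread (stand-in for a compute unit)."""
    while True:
        item = inbox.get()
        if item is None:          # shutdown sentinel: pass it downstream
            outbox.put(None)
            return
        outbox.put(work(item))

# Stub workers; real counterparts are ANE models (STT/TTS) and a GPU LLM.
stt = lambda audio: f"text({audio})"
llm = lambda text: f"reply({text})"
tts = lambda text: f"speech({text})"

audio_q, text_q, reply_q, out_q = (queue.Queue() for _ in range(4))
threads = [
    threading.Thread(target=stage, args=(audio_q, text_q, stt)),
    threading.Thread(target=stage, args=(text_q, reply_q, llm)),
    threading.Thread(target=stage, args=(reply_q, out_q, tts)),
]
for t in threads:
    t.start()

for chunk in ["utterance1", "utterance2"]:
    audio_q.put(chunk)
audio_q.put(None)

results = []
while (item := out_q.get()) is not None:
    results.append(item)
print(results)
```

Because each queue decouples its producer and consumer, the LLM can be generating a reply to one utterance while the STT stage is already transcribing the next — the same overlap the compute-unit split enables on the phone.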

Repo: https://github.com/fikrikarim/volocal

submitted by /u/ffinzy