We made significant improvements to the Kokoro TTS trainer

Reddit r/LocalLLaMA / 4/6/2026

Key Points

The post describes significant improvements to the Kokoro TTS training workflow, aimed specifically at making custom voice training more practical for users running locally.
The authors forked an existing CPU-based training tool (KVoiceWalk) into a new project that adds GPU/CUDA support to greatly speed up training.
On an NVIDIA RTX 3060, the updated approach reportedly achieves about a 6.5x speedup compared with CPU training.
A new GUI was added, including a queue system to train multiple voices more easily.
The authors say they plan to add this TTS, with their own custom voices, to their game in the coming days.

We made significant improvements to the Kokoro TTS trainer

Kokoro is a pretty popular TTS tool, and for good reason: it can run on CPUs on desktops and phones. We found it pretty useful ourselves, with only one issue: training custom voices. There was a great tool called KVoiceWalk that solved this. The only problem was that it ran on CPU only, and it took about 26 hours to train a single voice. So we made significant improvements.

We forked it here: https://github.com/BovineOverlord/kvoicewalk-with-GPU-CUDA-and-GUI-queue-system

As the name suggests, we added GPU/CUDA support to the tool. Training ran about 6.5x faster on a 3060 than on CPU. We also created a GUI for easier use, which includes a queuing system for training multiple voices.
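
To put the numbers above in perspective, here is a rough Python sketch of what a 6.5x speedup might mean for a queue of voices. It is not the fork's actual code: `VoiceJob`, `run_queue`, and `train_fn` are hypothetical names made up for illustration, and the estimate assumes the 6.5x figure applies to the same roughly 26-hour CPU run mentioned earlier.

```python
from collections import deque
from dataclasses import dataclass

CPU_HOURS_PER_VOICE = 26.0  # figure quoted in the post
GPU_SPEEDUP = 6.5           # figure quoted in the post for a 3060


@dataclass
class VoiceJob:
    # Hypothetical job record; the fork's real job format may differ.
    name: str
    target_audio: str


def estimated_gpu_hours(cpu_hours: float = CPU_HOURS_PER_VOICE,
                        speedup: float = GPU_SPEEDUP) -> float:
    """Estimate GPU time, assuming the speedup applies to the whole run."""
    return cpu_hours / speedup


def run_queue(jobs, train_fn):
    """Train queued voices one at a time, in the order they were added."""
    queue = deque(jobs)
    finished = []
    while queue:
        job = queue.popleft()
        train_fn(job)  # placeholder for whatever actually trains a voice
        finished.append(job.name)
    return finished


# Example: two queued voices with a do-nothing stand-in for training.
jobs = [VoiceJob("narrator", "narrator.wav"), VoiceJob("villain", "villain.wav")]
done = run_queue(jobs, train_fn=lambda job: None)
total_hours = estimated_gpu_hours() * len(jobs)
print(done, total_hours)
```

Under those assumptions a single voice would take about 4 hours instead of 26, so a queue of two voices would finish in roughly 8 hours rather than more than two days.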

Hope this helps the community. We'll be adding this TTS, with our own custom voices, to our game in the coming days. Let me know if you have any questions!

submitted by /u/TurtletopSoftware