Hi everyone,
I’ve been working with the new Qwen3-TTS models lately and realized that while the base models are great, the fine-tuning process can be a bit of a headache for many. To solve this, I created Qwen3-TTS-EasyFinetuning.
It’s an open-source WebUI designed to make the fine-tuning process as seamless as possible, even if you’re not a command-line wizard.
Key Features:

* User-Friendly WebUI: Manage your entire fine-tuning workflow from the browser.
* Multi-Speaker Support: I’ve implemented multi-speaker functionality (even ahead of some official implementations) so you can train diverse voice sets.
* Streamlined Pipeline: Handles everything from data processing to training and inference testing.
* Local-First: Designed to run on your own hardware, fitting the r/LocalLlama ethos.
Tech Stack:

* Based on Qwen3-TTS
* Built with Python/Gradio
* Optimized for consumer GPUs (tested on my RTX 3080 10 GB)
I’m still actively developing this and would love feedback from this community. If you're looking to give your local LLM a custom voice, give it a try!
GitHub: https://github.com/mozi1924/Qwen3-TTS-EasyFinetuning