I've spent the last few weekends working on a Qwen3 TTS implementation, which is a fork of https://github.com/predict-woo/qwen3-tts.cpp but with more features and a cleaner codebase: https://github.com/Danmoreng/qwen3-tts.cpp It currently supports:
I also built a desktop app UI for it using Kotlin Multiplatform: https://github.com/Danmoreng/qwen-tts-studio The app must be compiled from source; it works on Windows and Linux. Models still need to be converted to GGUF manually. Both repos are missing a bit of polish. However, they are in a state that I feel comfortable posting here.
Qwen3 TTS in C++ with 1.7B support, speaker encoding extraction, and desktop UI
Reddit r/LocalLLaMA / 3/15/2026
💬 Opinion · Developer Stack & Infrastructure · Tools & Practical Usage
Key Points
- A fork of Qwen3 TTS in C++ adds 1.7B model support, speaker encoding extraction, a JNI interface, and speaker instructions for custom voice models, including voice cloning for 0.6B and 1.7B bases.
- A desktop application UI was built with Kotlin Multiplatform (qwen-tts-studio) to run and test TTS locally on Windows and Linux.
- The project must be compiled from source, and models must be converted to GGUF manually, so setup is a hands-on process.
- The post presents the GitHub repos and a preview image, framing the work as a still-in-progress contribution shared for feedback.