Praxy Voice: Voice-Prompt Recovery + BUPS for Commercial-Class Indic TTS from a Frozen Non-Indic Base at Zero Commercial-Training-Data Cost
arXiv cs.CL / 4/29/2026
Key Points
- The paper proposes Praxy Voice, which upgrades a frozen multilingual TTS base that was not built for Indic languages to commercial-class output in Telugu, Tamil, and Hindi, without training a new acoustic decoder or using any commercial TTS training data.
- It combines three techniques: BUPS (Brahmic Unified Phoneme Space) for deterministic romanization into ISO-15919, a LoRA adapter trained on ~1,220 hours of licensed Indic audio targeting the text-token predictor, and a voice-prompt recovery recipe using short reference clips plus sampling overrides.
- In pilot evaluations on the PSP benchmark, Praxy Voice matches or slightly outperforms commercial baselines on multiple phonological measures, including low LLM-WER on Hindi and reduced “collapse” rates for Telugu and Tamil.
- For Hindi, where the LoRA reduced accuracy, the system uses a two-branch deployment that falls back to the vanilla base with the voice-prompt “Config B” recipe.
- The authors also address intra-sentential code-mixing with a third branch that uses IndicF5 with native-script transliteration, which significantly reduces code-mix LLM-WER. They release the R6 LoRA weights, the inference code and router, and a Gradio demo.
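
The three-branch deployment described in the last two points can be pictured as a small text router. The sketch below is only an illustration under stated assumptions, not the released router: the branch names (`lora`, `base_config_b`, `indicf5_codemix`), the function names, and the detection rules (Unicode script blocks to guess the language, any ASCII letter alongside Indic script to flag code-mixing) are all hypothetical. It also treats all Devanagari text as Hindi, which a real system would need to refine.

```python
from collections import Counter

# Unicode block ranges for the three scripts named in the summary.
# Assumption: Devanagari is treated as Hindi, although other languages use it too.
SCRIPT_RANGES = {
    "hindi": (0x0900, 0x097F),   # Devanagari block
    "tamil": (0x0B80, 0x0BFF),   # Tamil block
    "telugu": (0x0C00, 0x0C7F),  # Telugu block
}


def script_counts(text):
    """Count characters per script; ASCII letters are counted as 'latin'."""
    counts = Counter()
    for ch in text:
        if ch.isascii() and ch.isalpha():
            counts["latin"] += 1
            continue
        cp = ord(ch)
        for lang, (lo, hi) in SCRIPT_RANGES.items():
            if lo <= cp <= hi:
                counts[lang] += 1
                break
    return counts


def route(text):
    """Pick a hypothetical branch name for one input sentence."""
    counts = script_counts(text)
    indic = {k: v for k, v in counts.items() if k != "latin"}
    if not indic:
        return "unsupported"          # no Telugu, Tamil, or Devanagari characters
    if counts["latin"] > 0:
        return "indicf5_codemix"      # Indic script mixed with Latin letters
    lang = max(indic, key=indic.get)  # dominant script decides the language
    if lang == "hindi":
        return "base_config_b"        # vanilla base plus voice-prompt recipe
    return "lora"                     # Telugu and Tamil use the LoRA branch


if __name__ == "__main__":
    for sample in ["नमस्ते दुनिया", "வணக்கம்", "నమస్కారం", "मैं office जा रहा हूँ"]:
        print(sample, "->", route(sample))
```

Routing on script alone keeps the decision deterministic and cheap, but it cannot catch code-mixing written entirely in one script; that case would need a different detector.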