Human-1 by Josh Talks: A Full-Duplex Conversational Modeling Framework in Hindi using Real-World Conversations
arXiv cs.CL / 4/28/2026
Key Points
- The paper introduces “Human-1,” an open, reproducible full-duplex spoken dialogue system for Hindi, designed to handle realistic conversation phenomena like interruptions, overlaps, and backchannels.
- It builds on the Moshi duplex speech architecture, adding a custom Hindi tokenizer and training on 26,000 hours of real spontaneous conversations from 14,695 speakers. The data keeps each speaker on a separate channel, so the model learns turn-taking and overlap patterns directly.
- For Hindi text generation, the authors replace the original English tokenizer and reinitialize text-vocabulary-dependent parameters while keeping the pre-trained audio components.
- The training recipe has two stages: large-scale pre-training, followed by fine-tuning on 1,000 hours of conversational data.
- Experiments using prompted dialogue continuation show, in both automatic metrics and human evaluations, that the model produces natural, meaningful full-duplex conversational behavior in Hindi. The authors aim to extend the approach to other Indian languages.
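
To make the "separate speaker channels" point concrete, here is a minimal sketch of how overlap and barge-in statistics could be read off two-channel data. It is an illustration only, not the paper's pipeline: it assumes speech segments have already been detected per channel as (start, end) times in seconds, and the names `overlap_seconds` and `starts_during` are hypothetical.

```python
from typing import List, Tuple

# Hypothetical representation: one (start_sec, end_sec) pair per detected
# stretch of speech on a single speaker channel.
Segment = Tuple[float, float]


def overlap_seconds(a: List[Segment], b: List[Segment]) -> float:
    """Total time during which both channels contain speech.

    Assumes each list is sorted by start time and has no overlaps
    within itself (one speaker cannot overlap with themselves).
    """
    i = j = 0
    total = 0.0
    while i < len(a) and j < len(b):
        start = max(a[i][0], b[j][0])
        end = min(a[i][1], b[j][1])
        if end > start:
            total += end - start
        # Advance whichever segment finishes first.
        if a[i][1] <= b[j][1]:
            i += 1
        else:
            j += 1
    return total


def starts_during(speaker: List[Segment], other: List[Segment]) -> int:
    """Count segments of `speaker` that begin while `other` is talking.

    This is a crude proxy for interruptions and backchannels combined;
    telling them apart would need duration or lexical cues.
    """
    count = 0
    for start, _ in speaker:
        if any(o_start < start < o_end for o_start, o_end in other):
            count += 1
    return count


if __name__ == "__main__":
    # Toy data, invented for illustration.
    user = [(0.0, 2.0), (3.5, 5.0)]
    agent = [(1.5, 3.0), (4.0, 4.25)]
    print(overlap_seconds(user, agent))
    print(starts_during(agent, user))
```

With a single mixed channel these quantities would first require speaker separation; keeping one channel per speaker makes them directly observable, which is the property the summary attributes to the training data.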