MOSS-TTS-Nano: a 0.1B open-source multilingual TTS model that runs on 4-core CPU and supports realtime speech generation

Reddit r/LocalLLaMA / 4/12/2026


Key Points

  • MOSI.AI and the OpenMOSS team have open-sourced MOSS-TTS-Nano, a tiny 0.1B-parameter multilingual TTS model designed for practical deployment.
  • The model supports real-time, streaming speech generation and is built to run on a 4-core CPU without requiring a GPU.
  • It offers multilingual coverage (including Chinese, English, Japanese, Korean, and Arabic) and includes long-text voice cloning capabilities.
  • The project provides simple local deployment through scripts/CLI (infer.py, app.py, and command-line tools) plus an online demo and a Hugging Face Space for quick testing.

We just open-sourced MOSS-TTS-Nano, a tiny multilingual speech generation model from MOSI.AI and the OpenMOSS team.

Some highlights:

  • 0.1B parameters
  • Realtime speech generation
  • Runs on CPU without requiring a GPU
  • Multilingual support (Chinese, English, Japanese, Korean, Arabic, and more)
  • Streaming inference
  • Long-text voice cloning
  • Simple local deployment with infer.py, app.py, and CLI commands

The project is aimed at practical TTS deployment: small footprint, low latency, and easy local setup for demos, lightweight services, and product integration.

GitHub:
https://github.com/OpenMOSS/MOSS-TTS-Nano

Hugging Face:
https://huggingface.co/spaces/OpenMOSS-Team/MOSS-TTS-Nano

Online demo:
https://openmoss.github.io/MOSS-TTS-Nano-Demo/

Would love to hear feedback on quality, latency, and what use cases you’d want to try with a tiny open TTS model.

submitted by /u/TimeEnvironmental219