MOSS-TTS-Nano by OpenMOSS

Tiny, multilingual TTS for real-time, CPU-friendly deployment

Created 1 month ago
3,183 stars

Top 14.7% on SourcePulse

Project Summary

MOSS-TTS-Nano is an open-source, multilingual speech generation model with 0.1B parameters, built for real-time use. It prioritizes low latency, CPU-only inference, and a simple deployment stack, and targets local demos, web serving, and lightweight product integration.

How It Works

The model is a pure autoregressive pipeline that pairs MOSS-Audio-Tokenizer-Nano with a lightweight LLM. The design keeps the footprint small and the latency low, so streaming inference runs directly on CPU with no GPU dependency. The tokenizer, a CNN-free causal Transformer, compresses audio into a compact token stream while preserving high-fidelity reconstruction.
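The streaming idea described above can be illustrated with a toy loop. This is only a sketch of generic autoregressive streaming, not the project's API: `stream_audio`, `toy_next_token`, and `toy_decode_frame` are hypothetical stand-ins for the LLM and the tokenizer's decoder, and the sketch assumes one token maps to one audio chunk, which the summary does not state.

```python
from typing import Callable, Iterator, List


def stream_audio(
    next_token: Callable[[List[int]], int],
    decode_frame: Callable[[int], List[float]],
    prompt: List[int],
    max_frames: int,
    eos: int = -1,
) -> Iterator[List[float]]:
    """Yield one audio chunk per generated token instead of waiting for the full sequence."""
    history = list(prompt)
    for _ in range(max_frames):
        token = next_token(history)   # autoregressive step: depends only on past tokens
        if token == eos:
            break
        history.append(token)
        yield decode_frame(token)     # emit audio for this token right away


# Hypothetical toy stand-ins so the sketch runs without any model.
def toy_next_token(history: List[int]) -> int:
    # Emits the current history length, then stops once 5 tokens exist.
    return len(history) if len(history) < 5 else -1


def toy_decode_frame(token: int) -> List[float]:
    # Pretend each token decodes to 4 identical samples.
    return [token / 10.0] * 4


chunks = list(stream_audio(toy_next_token, toy_decode_frame, prompt=[0, 1], max_frames=10))
```

Because the generator yields as soon as each token is produced, a caller can start playback after the first chunk. A real causal decoder would also carry state across frames, which this toy version omits.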

Quick Start & Requirements

A Python 3.12 Conda environment is recommended. After cloning the repository, install the dependencies with pip install -r requirements.txt, then install the project in editable mode with pip install -e . to enable the moss-tts-nano CLI. WeTextProcessing may additionally require a manual install of pynini=2.1.6.post1.

Highlighted Details

  • Model Size: 0.1 billion parameters, small enough for CPU inference.
  • Multilingual: Supports 20 languages, including Chinese, English, Japanese, Korean, Spanish, French, and German.
  • Audio Output: Native 48 kHz, 2-channel, high-fidelity audio, compressed into a 12.5 Hz token stream.
  • Real-time Capability: Streaming inference runs with low latency on a 4-core CPU.
  • Voice Cloning: A voice cloning workflow is available through infer.py and the CLI and needs only a short reference clip.
  • Deployment Flexibility: Runs from direct Python scripts, a local FastAPI web demo, or the packaged CLI.
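The audio figures in the list above imply some simple arithmetic. The sketch below assumes that 12.5 Hz means token frames per second and that one frame spans both channels; the summary does not say how many tokens or codebooks make up a frame, so these helpers count frames only. The function names are illustrative and do not come from the project.

```python
import math

SAMPLE_RATE_HZ = 48_000  # from the summary: native 48 kHz output
CHANNELS = 2             # from the summary: 2-channel audio
FRAME_RATE_HZ = 12.5     # from the summary: 12.5 Hz token stream


def frame_duration_ms(frame_rate_hz: float = FRAME_RATE_HZ) -> float:
    """Audio time covered by one token frame, in milliseconds."""
    return 1000.0 / frame_rate_hz


def samples_per_frame(
    sample_rate_hz: int = SAMPLE_RATE_HZ,
    frame_rate_hz: float = FRAME_RATE_HZ,
    channels: int = CHANNELS,
) -> int:
    """PCM samples (summed over all channels) represented by one token frame."""
    return int(sample_rate_hz / frame_rate_hz * channels)


def frames_for_seconds(seconds: float, frame_rate_hz: float = FRAME_RATE_HZ) -> int:
    """Token frames needed to cover a clip of the given length, rounded up."""
    return math.ceil(seconds * frame_rate_hz)


print(frame_duration_ms())     # 80.0 ms of audio per frame
print(samples_per_frame())     # 7680 samples (3840 per channel)
print(frames_for_seconds(10))  # 125 frames for a 10-second clip
```

Under these assumptions each autoregressive step accounts for 80 ms of audio, so real-time streaming requires generating and decoding a frame in less than 80 ms on average.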

Maintenance & Community

The README does not specify community channels (e.g., Discord, Slack) or list notable contributors or sponsorships.

Licensing & Compatibility

The project intends to follow a root LICENSE file. Until that file is published, the repository is to be treated as "not yet licensed for redistribution," which may block commercial use or integration into closed-source projects.

Limitations & Caveats

The primary limitation is the lack of a published license. Until licensing terms are clarified, the "not yet licensed for redistribution" status is a significant barrier to adoption.

Health Check
Last Commit: 1 week ago
Responsiveness: Inactive
Pull Requests (30d): 2
Issues (30d): 25
Star History
1,004 stars in the last 30 days
