supertonic by supertone-inc

Lightning-fast, on-device Text-to-Speech (TTS)

Created 3 months ago
2,636 stars

Top 17.5% on SourcePulse

Project Summary

Supertonic is a lightning-fast, on-device text-to-speech (TTS) system built on ONNX Runtime for high performance with minimal computational overhead. Because synthesis runs entirely locally, it offers complete privacy and no network latency, targeting developers who need efficient TTS across diverse platforms without cloud dependencies.

How It Works

The system performs cross-platform, on-device inference through ONNX Runtime using a lightweight 66M-parameter model. Its core advantages are local processing, which ensures privacy and eliminates network latency, and native handling of complex text inputs without a separate pre-processing step.

Quick Start & Requirements

  • Install: Clone the repository, then download the ONNX models and voices from the Hugging Face Hub using Git LFS.
  • Prerequisites: Git LFS must be installed.
  • Run: Example implementations are provided for Python, Node.js, the browser (WebGPU/WASM), Java, C++, C#, Go, Swift, iOS, and Rust, each with its own build/run commands.
  • Docs/Demo: Interactive Demo (in-browser), Hugging Face Hub models, Raspberry Pi demo video.
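
The install steps above can be sketched as a short script. It is a dry run by default, printing each command for review before anything executes; note the Hugging Face model-repo path below is an assumption (the README links the exact repo), and the GitHub URL is inferred from the project and org names.

```python
import shlex
import subprocess

# Flip to True to actually execute the commands below.
EXECUTE = False

def plan_install() -> list[list[str]]:
    """Return the install commands described in the Quick Start."""
    return [
        # Prerequisite: Git LFS, so the ONNX model weights download fully.
        ["git", "lfs", "install"],
        # Sample code (URL inferred from the project/org names).
        ["git", "clone", "https://github.com/supertone-inc/supertonic.git"],
        # Models and voices from the Hugging Face Hub -- this repo path is
        # an ASSUMPTION; substitute the repo linked from the README.
        ["git", "clone", "https://huggingface.co/Supertone/supertonic"],
    ]

for cmd in plan_install():
    print("$", shlex.join(cmd))
    if EXECUTE:
        subprocess.run(cmd, check=True)
```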

Highlighted Details

  • Performance: Up to 167x faster than real time on consumer hardware (Apple M4 Pro CPU), with very low real-time factors (e.g., an RTF of 0.001 on an RTX 4090).
  • Lightweight: 66M parameters, optimized for minimal footprint.
  • On-Device: Runs entirely locally, guaranteeing privacy and eliminating network latency.
  • Natural Text Handling: Processes numbers, dates, currency, abbreviations, and complex expressions natively.
  • Multi-Platform: Broad ecosystem support including web, mobile, and server-side languages.
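
The performance figures above are expressed as real-time factors (RTF): wall-clock synthesis time divided by the duration of audio produced. A small sketch of the arithmetic, using only the numbers quoted in the bullets (not an official benchmark):

```python
def real_time_factor(synthesis_seconds: float, audio_seconds: float) -> float:
    """RTF = wall-clock synthesis time / duration of generated audio.

    RTF < 1 means faster than real time; speedup over real time = 1 / RTF.
    """
    return synthesis_seconds / audio_seconds

# "167x faster than real time" (the M4 Pro CPU figure above) corresponds
# to an RTF of roughly 1/167, i.e. about 0.006.
print(round(1 / 167, 4))

# Conversely, the quoted RTF of 0.001 (RTX 4090) means synthesis runs at
# roughly 1000x real time.
print(round(1 / 0.001))
```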

Maintenance & Community

Copyright held by Supertone Inc. Associated research papers are recent (2025 arXiv preprints). No specific community channels or detailed maintenance status are provided in the README.

Licensing & Compatibility

The sample code is MIT licensed; the model is released under the OpenRAIL-M license; PyTorch (a training dependency) is BSD 3-Clause. MIT is generally permissive for commercial use, while OpenRAIL-M carries use-based restrictions that should be reviewed before deployment.

Limitations & Caveats

GPU-backed ONNX Runtime inference is noted as untested. The README does not document the project's bus factor or any deprecation notices.

Health Check

  • Last commit: 1 month ago
  • Responsiveness: Inactive
  • Pull requests (30d): 0
  • Issues (30d): 6
  • Star history: 127 stars in the last 30 days

Explore Similar Projects

Starred by Andrej Karpathy (founder of Eureka Labs; formerly at Tesla and OpenAI; author of CS 231n), Jeff Hammerbacher (cofounder of Cloudera), and 1 more.

moonshine by moonshine-ai (9.0%, 4k stars)

Speech-to-text models optimized for fast, accurate ASR on edge devices. Created 1 year ago; updated 2 days ago.