Genie by High-Logic

CPU-optimized TTS inference engine

Created 3 weeks ago

595 stars

Top 54.8% on SourcePulse

Project Summary

Genie is a lightweight inference engine and model converter for the GPT-SoVITS speech synthesis project. It targets users who need efficient, CPU-based speech synthesis with low latency and a small runtime footprint, offering a convenient API server and model conversion tools.

How It Works

Genie optimizes the GPT-SoVITS V2 model for CPU inference, achieving significantly lower latency and a much smaller runtime size compared to the official PyTorch or ONNX models. This is accomplished through ONNX model conversion and specific optimizations tailored for CPU performance, making it suitable for applications where GPU resources are limited or not cost-effective.

Quick Start & Requirements

  • Installation: pip install genie-tts
  • Prerequisites: Python >= 3.9. Windows users may need Visual Studio Build Tools with the "Desktop development with C++" workload for pyopenjtalk installation.
  • Quick Tryout: Includes predefined characters for immediate use without requiring model files.
  • Documentation: Demo Video, API Server Tutorial
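Because the package requires Python >= 3.9, checking the interpreter version before installing can save a failed build. A minimal stdlib sketch (the helper name is illustrative, not part of genie-tts):

```python
import sys

# genie-tts requires Python >= 3.9 (per the prerequisites above).
MIN_VERSION = (3, 9)

def meets_requirement(version=sys.version_info):
    """Return True if the interpreter satisfies the package's minimum version."""
    return tuple(version[:2]) >= MIN_VERSION

if not meets_requirement():
    raise SystemExit("genie-tts needs Python >= 3.9")
```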

Highlighted Details

  • Achieves 1.13s first inference latency on CPU (i7-13620H), outperforming official PyTorch (1.35s) and ONNX (3.57s) models.
  • Runtime size is approximately 200MB, with model sizes around 230MB, significantly smaller than the multi-GB official PyTorch models.
  • Supports GPT-SoVITS V2 models and Japanese language.
  • Includes tools for ONNX model conversion and a FastAPI server for API access.
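The latency figures quoted above imply the following speedups; a quick back-of-envelope check:

```python
# First-inference latency on an i7-13620H CPU, per the project's benchmarks.
genie_s = 1.13    # Genie (optimized ONNX)
pytorch_s = 1.35  # official PyTorch model
onnx_s = 3.57     # official ONNX model

speedup_vs_pytorch = pytorch_s / genie_s
speedup_vs_onnx = onnx_s / genie_s

print(f"vs PyTorch: {speedup_vs_pytorch:.2f}x")  # vs PyTorch: 1.19x
print(f"vs ONNX:    {speedup_vs_onnx:.2f}x")     # vs ONNX:    3.16x
```

So the optimized model is a modest ~1.2x faster than the official PyTorch path, but over 3x faster than the official ONNX export.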

Maintenance & Community

  • The project is actively maintained by High-Logic.
  • Roadmap includes support for more languages (Chinese, English), future GPT-SoVITS versions (V2Proplus, V3, V4), and easier deployment options like Docker images.

Licensing & Compatibility

  • The README does not explicitly state the license. Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

  • Currently supports only GPT-SoVITS V2 models and Japanese language.
  • The project targets CPU inference only; the README does not mention GPU acceleration.
  • Installation of pyopenjtalk may require C++ build tools on Windows.
Health Check

  • Last Commit: 2 weeks ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 2
  • Issues (30d): 2
  • Star History: 597 stars in the last 24 days
