Chatterbox-TTS-Server by devnen

Self-host a powerful TTS server with a web UI and API

Created 3 months ago
525 stars

Top 60.2% on SourcePulse

Project Summary

This project provides a self-hostable server for the Chatterbox TTS model, offering a user-friendly web UI and an OpenAI-compatible API. It targets developers and users needing to generate high-quality speech, perform voice cloning, and process large text volumes for applications like audiobook creation, with accelerated performance on NVIDIA, AMD, and Apple Silicon hardware.

How It Works

The server wraps the Chatterbox TTS engine in a FastAPI backend that serves both the API and the web UI. It splits long text inputs into chunks at sentence boundaries and concatenates the resulting audio, supports voice cloning from reference audio, and offers predefined voices for consistent output. An optional seed parameter further improves generation consistency.
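The sentence-based chunking described above can be illustrated with a minimal sketch. This is not the project's actual implementation: the function names, the `max_chars` limit, and the naive punctuation-based sentence splitter are all assumptions made for illustration.

```python
import re


def split_sentences(text: str) -> list[str]:
    """Split on whitespace that follows '.', '!' or '?' (a naive heuristic)."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def chunk_text(text: str, max_chars: int = 200) -> list[str]:
    """Pack whole sentences into chunks of at most max_chars characters.

    A single sentence longer than max_chars is kept intact as its own
    chunk, so sentences are never cut in the middle.
    max_chars is a hypothetical parameter, not a documented server setting.
    """
    chunks: list[str] = []
    current = ""
    for sentence in split_sentences(text):
        candidate = f"{current} {sentence}" if current else sentence
        if current and len(candidate) > max_chars:
            # Adding this sentence would overflow: close the current chunk.
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks
```

Each chunk would then be synthesized separately and the audio segments joined in order. Splitting only at sentence boundaries keeps each join at a natural pause, which is presumably why the server chunks this way.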

Quick Start & Requirements

  • Installation: Clone the repository, create a Python virtual environment, and install dependencies using pip install -r requirements-nvidia.txt (for NVIDIA), requirements-rocm.txt (for AMD), or requirements.txt (for CPU). Apple Silicon requires a specific multi-step installation.
  • Prerequisites: Python 3.10+, Git. Optional but recommended: NVIDIA GPU with CUDA, AMD GPU with ROCm (Linux), or Apple Silicon. Linux users may need libsndfile1 and ffmpeg.
  • Demo: A Google Colab notebook is available for instant testing without local installation.
  • Docs: Interactive API documentation is available at /docs after server startup.

Highlighted Details

  • OpenAI-compatible /v1/audio/speech endpoint.
  • Intelligent text chunking for audiobook-scale processing.
  • Voice cloning and predefined voice modes with generation seed for consistency.
  • Web UI with configuration management and session persistence.
  • Docker support with specific compose files for NVIDIA, AMD (ROCm), and CPU.
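As a rough illustration of the OpenAI-compatible endpoint listed above, the sketch below builds a POST request for /v1/audio/speech using only the Python standard library. The JSON field names (`model`, `input`, `voice`, `response_format`) follow OpenAI's public speech API. The base URL, port, model name, and voice name are placeholders assumed for illustration, so check the server's /docs page for the values it actually accepts.

```python
import json
import urllib.request


def build_speech_request(
    text: str,
    voice: str,
    base_url: str = "http://localhost:8004",  # assumed host and port
    response_format: str = "wav",
    model: str = "chatterbox",  # placeholder model name
) -> urllib.request.Request:
    """Build (but do not send) a POST request for the speech endpoint."""
    payload = {
        "model": model,
        "input": text,
        "voice": voice,
        "response_format": response_format,
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/audio/speech",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

To send the request against a running server, pass the returned object to `urllib.request.urlopen` and write the response bytes to a file. Keeping construction separate from sending makes the payload easy to inspect before any network call.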

Maintenance & Community

The project is actively maintained by devnen. Community interaction and contributions are encouraged via GitHub issues and pull requests.

Licensing & Compatibility

Licensed under the MIT License, permitting commercial use and integration with closed-source projects.

Limitations & Caveats

AMD ROCm support is limited to Linux. Although the project offers extensive troubleshooting guidance, ROCm compatibility issues or older AMD GPU architectures may require manual configuration overrides. The "UI Cancel" button stops the frontend from waiting but does not immediately halt backend inference.

Health Check

  • Last Commit: 2 months ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 11
  • Star History: 73 stars in the last 30 days
