Chatterbox-TTS-Server by devnen

Self-host a powerful TTS server with a web UI and API

Created 9 months ago

1,040 stars

Top 35.9% on SourcePulse

1 Expert Loves This Project

Abubakar Abid

Cofounder of Gradio

Project Summary

This project provides a self-hostable server for the Chatterbox TTS model, with a web UI and an OpenAI-compatible API. It is aimed at developers and users who need to generate high-quality speech, clone voices, or process large volumes of text for applications such as audiobook creation. Inference can be accelerated on NVIDIA, AMD, and Apple Silicon hardware.

How It Works

The server wraps the Chatterbox TTS engine in a FastAPI backend that serves both the API and the web UI. Long text inputs are split into chunks along sentence boundaries, and the audio for those chunks is concatenated into a single output. Voice cloning works from reference audio, and predefined voices are available for consistent output. An optional seed parameter further improves consistency between generations.
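The sentence-based chunking described above can be illustrated with a minimal sketch. This is not the project's actual implementation: the function name chunk_text, the max_chars limit, and the regex-based sentence splitter are all assumptions made for illustration.

```python
import re


def chunk_text(text: str, max_chars: int = 300) -> list[str]:
    """Group whole sentences into chunks of at most max_chars characters.

    Hypothetical helper: the real server's chunking rules may differ.
    A single sentence longer than max_chars is kept whole as its own chunk.
    """
    # Naive sentence boundary: whitespace that follows '.', '!' or '?'.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            # Adding this sentence would overflow, so close the current chunk.
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


if __name__ == "__main__":
    for i, chunk in enumerate(chunk_text("One. Two! Three? Four.", max_chars=10)):
        print(i, chunk)
```

Splitting only at sentence boundaries means each chunk can be synthesized on its own and the audio joined at natural pauses, which is why a chunk never ends mid-sentence in this sketch.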

Quick Start & Requirements

Installation: Clone the repository, create a Python virtual environment, and install dependencies with pip install -r followed by the requirements file for your hardware: requirements-nvidia.txt (NVIDIA), requirements-rocm.txt (AMD), or requirements.txt (CPU). Apple Silicon requires a separate multi-step installation.
Prerequisites: Python 3.10+, Git. Optional but recommended: NVIDIA GPU with CUDA, AMD GPU with ROCm (Linux), or Apple Silicon. Linux users may need libsndfile1 and ffmpeg.
Demo: A Google Colab notebook is available for instant testing without local installation.
Docs: Interactive API documentation is available at /docs after server startup.

Highlighted Details

OpenAI-compatible /v1/audio/speech endpoint.
Intelligent text chunking for audiobook-scale processing.
Voice cloning and predefined voice modes with generation seed for consistency.
Web UI with configuration management and session persistence.
Docker support with specific compose files for NVIDIA, AMD (ROCm), and CPU.
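As a rough illustration of calling the OpenAI-compatible endpoint, the sketch below builds a POST request to /v1/audio/speech using only the Python standard library. Several details are assumptions rather than facts from this summary: the base URL and port, the model name "chatterbox", the voice name "default", and whether this endpoint accepts a seed field. The model, input, voice, and response_format fields follow the shape of OpenAI's speech API; check the server's /docs page for the fields it actually accepts.

```python
import json
import urllib.request

# Assumption: the server is reachable here; adjust host and port to your setup.
BASE_URL = "http://localhost:8000"


def build_speech_request(text, voice="default", base_url=BASE_URL, seed=None):
    """Build (but do not send) a request for the /v1/audio/speech endpoint."""
    payload = {
        "model": "chatterbox",      # hypothetical model name
        "input": text,
        "voice": voice,             # hypothetical voice name
        "response_format": "wav",
    }
    if seed is not None:
        # Assumption: the endpoint accepts a seed for repeatable output.
        payload["seed"] = seed
    return urllib.request.Request(
        f"{base_url}/v1/audio/speech",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def synthesize(text, out_path="speech.wav", **kwargs):
    """Send the request and write the returned audio bytes to out_path."""
    request = build_speech_request(text, **kwargs)
    with urllib.request.urlopen(request) as response, open(out_path, "wb") as f:
        f.write(response.read())
    return out_path
```

With a server running, synthesize("Hello from Chatterbox.", seed=42) would write the returned audio to speech.wav. Keeping request construction separate from sending makes the payload easy to inspect before any network call is made.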

Maintenance & Community

The project is actively maintained by devnen. Community interaction and contributions are encouraged via GitHub issues and pull requests.

Licensing & Compatibility

Licensed under the MIT License, permitting commercial use and integration with closed-source projects.

Limitations & Caveats

AMD ROCm support is limited to Linux. Although the project includes extensive troubleshooting guidance, some ROCm compatibility issues and older AMD GPU architectures may require manual configuration overrides. The Cancel button in the UI stops the frontend from waiting but does not immediately halt backend inference.

Health Check

Last Commit

1 week ago

Responsiveness

Inactive

Star History

68 stars in the last 30 days