FastAPI server for XTTSv2 text-to-speech
This project provides a FastAPI server for the XTTSv2 text-to-speech model, targeting users who need a programmatic interface for voice generation, particularly those integrating with applications like SillyTavern. It offers a flexible way to leverage XTTSv2's capabilities, including voice cloning and multi-language support, with options for performance optimization.
How It Works
The server wraps the XTTSv2 model in a FastAPI application, exposing endpoints for text-to-speech generation. Models can be loaded locally or fetched via an API, with options to specify the model version. Performance can be tuned with --deepspeed for accelerated inference or --lowvram for a reduced GPU memory footprint. A streaming mode is available for near real-time audio output, with an improved variant for complex languages.
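For programmatic use, generation is requested over HTTP. Below is a minimal client sketch assuming the server is running on the default port 8020; the tts_to_file endpoint is mentioned in this project, but the JSON field names and values shown are assumptions, so confirm the exact request schema in the interactive docs at http://localhost:8020/docs.

```python
import requests

# Minimal sketch of a client call to the server's tts_to_file endpoint.
# The endpoint name comes from this project; the JSON field names below
# are assumptions -- check http://localhost:8020/docs for the authoritative schema.
payload = {
    "text": "Hello from XTTSv2.",          # text to synthesize
    "speaker_wav": "example_voice",        # assumed: reference voice sample for cloning
    "language": "en",                      # assumed: target language code
    "file_name_or_path": "output.wav",     # assumed: where the server writes the audio
}

response = requests.post("http://localhost:8020/tts_to_file", json=payload)
response.raise_for_status()
print("Server replied:", response.text)
```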
Quick Start & Requirements
- Install the server: `pip install xtts-api-server`
- Install PyTorch with CUDA 11.8 support (required): `pip install torch==2.1.1+cu118 torchaudio==2.1.1+cu118 --index-url https://download.pytorch.org/whl/cu118`
- On Debian/Ubuntu, install system dependencies: `sudo apt install -y python3-dev python3-venv portaudio19-dev`
- Start the server: `python -m xtts_api_server`
- The interactive API docs are served at http://localhost:8020/docs
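After starting the server, a quick way to verify it is reachable is to request the docs page; this small sketch assumes the default host and port shown above.

```python
import requests

# Sanity check: FastAPI serves interactive API docs at /docs on the default port 8020.
resp = requests.get("http://localhost:8020/docs", timeout=5)
if resp.ok:
    print("Server is up; browse the API at http://localhost:8020/docs")
else:
    print("Unexpected status code:", resp.status_code)
```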
Highlighted Details
- --deepspeed enables a 2-3x processing speedup.
- --streaming-mode provides near real-time audio playback (client sketch below).
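A rough idea of how a streaming client might consume audio is sketched below. The route name /tts_stream and its query parameters are assumptions (this summary only confirms that --streaming-mode exists), so verify the actual streaming route in the server's /docs when running in that mode.

```python
import requests

# Hypothetical streaming client: the /tts_stream route and its query parameters are
# assumptions, not confirmed by this summary -- check the server's /docs in streaming mode.
params = {
    "text": "Audio is consumed chunk by chunk as it is generated.",
    "speaker_wav": "example_voice",   # assumed parameter name
    "language": "en",                 # assumed parameter name
}

with requests.get("http://localhost:8020/tts_stream", params=params, stream=True) as resp:
    resp.raise_for_status()
    with open("streamed_output.wav", "wb") as out:
        for chunk in resp.iter_content(chunk_size=4096):
            out.write(chunk)
```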
Maintenance & Community
The project acknowledges contributions from Kolja Beigel (RealtimeTTS), erew123, and lendot. The author notes limited time for active development, suggesting users explore similar projects.
Licensing & Compatibility
The repository does not explicitly state a license in the README. Code usage is permitted for personal needs, and PRs are welcome.
Limitations & Caveats
Streaming mode has limitations: it works only locally and does not support the tts_to_file endpoint. The author advises users to check a similar project for alternative XTTS implementations.