sesame_csm_openai  by phildougherty

OpenAI-compatible TTS API for voice cloning

Created 5 months ago
390 stars

Top 73.6% on SourcePulse

Project Summary

This project provides an OpenAI-compatible Text-to-Speech (TTS) API using the Sesame CSM-1B model, enabling high-quality voice generation and cloning. It targets developers and users of AI chat platforms like OpenWebUI, offering consistent voices and custom voice creation from audio files or YouTube videos.

How It Works

The API leverages the CSM-1B model for speech synthesis, which uses acoustic "seed" samples to maintain voice consistency across requests. It supports multiple audio formats and offers CUDA acceleration for faster generation. Voice cloning is achieved by processing user-provided audio samples or YouTube segments, creating unique voice IDs for subsequent TTS generation.
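Because the server follows the OpenAI TTS request shape, a client call can be sketched with the Python standard library alone. The example below is a minimal illustration, not the project's own client: the host and port (localhost:8000), the model name "csm-1b", and the voice name "alloy" are assumptions that should be checked against the running deployment. Only the /audio/speech path and the JSON field names come from the OpenAI API that the project says it is compatible with.

```python
import json
import urllib.request

# Assumptions (not confirmed by this summary): the server listens on
# localhost:8000 and accepts "csm-1b" as a model name and "alloy" as a voice.
BASE_URL = "http://localhost:8000/v1"


def build_speech_request(text, voice="alloy", response_format="mp3",
                         model="csm-1b", base_url=BASE_URL):
    """Build (but do not send) a POST request in the OpenAI /audio/speech shape."""
    payload = {
        "model": model,
        "input": text,
        "voice": voice,
        "response_format": response_format,
    }
    return urllib.request.Request(
        url=f"{base_url}/audio/speech",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def synthesize(text, out_path, **kwargs):
    """Send the request and write the returned audio bytes to out_path."""
    request = build_speech_request(text, **kwargs)
    with urllib.request.urlopen(request) as response:
        audio = response.read()
    with open(out_path, "wb") as handle:
        handle.write(audio)
    return len(audio)
```

Calling synthesize("Hello there", "hello.mp3") would write the generated audio to disk if a server is reachable at the assumed address. A cloned voice would presumably be selected by passing its voice ID as the voice argument, though the exact ID format is not stated here.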

Quick Start & Requirements

  • Install/Run: docker compose up -d --build
  • Prerequisites: Docker, Docker Compose, NVIDIA GPU with CUDA, Hugging Face account with access to sesame/csm-1b.
  • Setup: Requires Hugging Face token in .env file. First startup downloads models and may take time.
  • Docs: OpenAI TTS API, Voice Cloning UI

Highlighted Details

  • OpenAI API compatibility for seamless integration.
  • Voice cloning from local files or YouTube URLs.
  • Supports MP3, OPUS, AAC, FLAC, WAV formats.
  • Multi-GPU support via CSM_DEVICE_MAP environment variable.
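As a small illustration of working with the supported formats, a client might validate the requested format before sending a request and pick a matching output filename. This is a hypothetical helper, not part of the project; the file extensions are common conventions chosen for the example.

```python
from pathlib import Path

# Formats listed in this summary. The extensions are conventional choices
# made for illustration, not values taken from the project itself.
EXTENSIONS = {
    "mp3": ".mp3",
    "opus": ".opus",
    "aac": ".aac",
    "flac": ".flac",
    "wav": ".wav",
}


def output_path(stem, response_format):
    """Return a filename for the given format, or raise if it is unsupported."""
    fmt = response_format.strip().lower()
    if fmt not in EXTENSIONS:
        supported = ", ".join(sorted(EXTENSIONS))
        raise ValueError(f"unsupported format {response_format!r}; expected one of: {supported}")
    # Append rather than replace a suffix so stems containing dots stay intact.
    return Path(f"{stem}{EXTENSIONS[fmt]}")
```

Rejecting an unknown format on the client side gives a clearer error than waiting for the server to refuse the request.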

Maintenance & Community

  • Not affiliated with Sesame or OpenAI.

Licensing & Compatibility

  • MIT License for the API.
  • The CSM-1B model is covered by Sesame's own license terms. Commercial use is possible as long as that model license is followed.

Limitations & Caveats

Voice cloning quality depends heavily on the input audio quality and clarity. YouTube cloning may yield lower quality with noisy sources or background music. The README notes potential voice drift with long pauses between requests.

Health Check

  • Last commit: 3 weeks ago
  • Responsiveness: 1 day
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 14 stars in the last 30 days
