cosyvoice-api  by jianchang512

API for text-to-speech using CosyVoice

Created 1 year ago
309 stars

Top 86.9% on SourcePulse

Project Summary

This project provides a Python API wrapper for the CosyVoice2 text-to-speech model, enabling developers to integrate advanced voice synthesis capabilities into their applications. It targets developers and researchers working with AI voice generation, offering flexible options for both built-in voice synthesis and voice cloning.

How It Works

The API exposes one endpoint per synthesis task. The /tts endpoint handles basic text-to-speech with predefined roles (built-in voices keyed by language and gender). The /clone_eq endpoint performs same-language voice cloning from a reference audio clip and its matching text, while /clone performs cross-lingual voice cloning. An OpenAI-compatible /v1/audio/speech endpoint is also provided for integration with existing OpenAI TTS workflows.
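As a rough illustration of calling these endpoints, the sketch below builds POST requests for /tts and /clone_eq with the Python standard library. The summary does not document the request format, so the base URL, the form encoding, the field names (text, role, reference_audio, reference_text), and the example values are all assumptions; check api.py for the real ones.

```python
import urllib.parse
import urllib.request

# Assumption: placeholder address. Flask's default port is 5000, but api.py
# may bind a different host or port.
BASE_URL = "http://127.0.0.1:5000"


def build_form_request(endpoint: str, fields: dict) -> urllib.request.Request:
    """Build a form-encoded POST request for one API endpoint."""
    body = urllib.parse.urlencode(fields).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + endpoint,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


def synthesize(req: urllib.request.Request, out_path: str) -> None:
    """Send the request and save the response body.

    Assumption: the server answers with raw audio bytes.
    """
    with urllib.request.urlopen(req, timeout=120) as resp:
        audio = resp.read()
    with open(out_path, "wb") as f:
        f.write(audio)


# Hypothetical field names and values, for illustration only.
tts_req = build_form_request(
    "/tts",
    {"text": "Hello, world.", "role": "english_female"},
)
clone_req = build_form_request(
    "/clone_eq",
    {
        "text": "This sentence is spoken in the cloned voice.",
        "reference_audio": "reference.wav",
        "reference_text": "Transcript of the reference clip.",
    },
)
```

With a server running, synthesize(tts_req, "out.wav") would send the request and write the returned bytes to disk.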

Quick Start & Requirements

  • Install Flask: python -m pip install flask
  • Run the API: python api.py
  • Requires a pre-deployed CosyVoice2 instance.
  • Refer to the CosyVoice2 project for core model requirements.
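
Because model loading can delay startup, a client script may want to wait until the server accepts connections before sending requests. The helper below is a minimal sketch of that check; wait_for_port is a hypothetical name, and the host and port you pass are whatever api.py actually binds, which this summary does not state.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0,
                  interval: float = 0.5) -> bool:
    """Return True once a TCP connection to host:port succeeds.

    Retries every `interval` seconds and returns False if `timeout`
    seconds pass without a successful connection.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            time.sleep(interval)
    return False
```

For example, wait_for_port("127.0.0.1", 5000) would block for up to 30 seconds; port 5000 here is only Flask's default and may not match the real deployment.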

Highlighted Details

  • Supports multiple synthesis modes: built-in voices, same-language voice cloning, and cross-lingual voice cloning.
  • Provides an OpenAI-compatible API endpoint for easier integration.
  • Allows specifying text, reference audio, and reference text for cloning.
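
For the OpenAI-compatible endpoint, a request would presumably follow the shape of OpenAI's speech API, which takes a JSON body with model, input, voice, and an optional response_format. The sketch below builds such a request. The base URL is a placeholder, and how this wrapper interprets model and voice (for instance, whether voice maps to a built-in role or to a reference audio file) is not stated in the summary, so treat those values as assumptions.

```python
import json
import urllib.request

# Assumption: placeholder address; use the host and port api.py binds.
BASE_URL = "http://127.0.0.1:5000"


def build_speech_request(text: str, voice: str,
                         response_format: str = "wav") -> urllib.request.Request:
    """Build a JSON POST request in the shape of OpenAI's speech API."""
    payload = {
        "model": "tts-1",   # OpenAI model name; the wrapper may ignore it
        "input": text,      # text to synthesize
        "voice": voice,     # assumption: accepted values are wrapper-specific
        "response_format": response_format,
    }
    return urllib.request.Request(
        BASE_URL + "/v1/audio/speech",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical voice value, for illustration only.
speech_req = build_speech_request("Hello from CosyVoice.", voice="english_female")
```

Existing OpenAI client code can usually be pointed at such a server by changing only its base URL, which is the main appeal of a compatible endpoint.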

Maintenance & Community

  • Project maintained by jianchang512.
  • No explicit community links (Discord, Slack) or roadmap are provided in the README.

Licensing & Compatibility

  • The README does not specify a license.
  • Compatibility for commercial use or closed-source linking is undetermined without a license.

Limitations & Caveats

The project relies on a pre-existing CosyVoice2 deployment, and its own licensing is not specified, which may impact commercial adoption. The README does not detail error handling or advanced configuration options.

Health Check

  • Last commit: 2 weeks ago
  • Responsiveness: Inactive
  • Pull requests (30d): 1
  • Issues (30d): 0
  • Star history: 6 stars in the last 30 days
