cosyvoice-api by jianchang512

API for text-to-speech using CosyVoice

Created 1 year ago

331 stars

Top 82.8% on SourcePulse

Project Summary

This project provides a Python API wrapper for the CosyVoice2 text-to-speech model, enabling developers to integrate advanced voice synthesis capabilities into their applications. It targets developers and researchers working with AI voice generation, offering flexible options for both built-in voice synthesis and voice cloning.

How It Works

The API exposes several endpoints for different synthesis tasks. The /tts endpoint handles basic text-to-speech with predefined roles (languages/genders). The /clone_eq endpoint performs same-language voice cloning from a reference audio clip and its matching transcript, while /clone performs cross-lingual voice cloning. An OpenAI-compatible /v1/audio/speech endpoint is also provided, so existing OpenAI TTS client code can be pointed at this API.
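The endpoint layout above can be sketched as a small client. This is a minimal sketch, not the project's documented client: the base URL and port, the form-encoded request format, the field names (text, role, reference_audio, reference_text), and the example values are all assumptions made for illustration. Check api.py for the actual parameters before relying on it.

```python
import urllib.parse
import urllib.request

# Assumption: address where `python api.py` is listening; adjust to your setup.
BASE_URL = "http://127.0.0.1:9233"


def build_request(endpoint, fields, base_url=BASE_URL):
    """Build (but do not send) a form-encoded POST request for one endpoint."""
    url = base_url.rstrip("/") + "/" + endpoint.lstrip("/")
    body = urllib.parse.urlencode(fields).encode("utf-8")
    return urllib.request.Request(url, data=body, method="POST")


def synthesize(endpoint, fields, out_path, timeout=120):
    """Send the request and save the response body to a file.

    Assumption: the server answers with raw audio bytes on success.
    """
    req = build_request(endpoint, fields)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        audio = resp.read()
    with open(out_path, "wb") as f:
        f.write(audio)
    return out_path


# Built-in voice: text plus a predefined role (hypothetical role name).
tts_request = build_request("/tts", {"text": "Hello, world.", "role": "english_female"})

# Same-language cloning: text, a reference clip, and the clip's transcript
# (hypothetical field names and file path).
clone_request = build_request(
    "/clone_eq",
    {
        "text": "Hello, world.",
        "reference_audio": "ref.wav",
        "reference_text": "Transcript of the reference clip.",
    },
)
```

With a server running, a call such as synthesize("/tts", {"text": "Hello, world.", "role": "english_female"}, "out.wav") would write the returned audio to out.wav, assuming the field names match what api.py expects.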

Quick Start & Requirements

Install Flask: python -m pip install flask
Run the API: python api.py
Requires a pre-deployed CosyVoice2 instance.
Refer to the CosyVoice2 project for core model requirements.

Highlighted Details

Supports multiple synthesis modes: built-in voices, same-language voice cloning, and cross-lingual voice cloning.
Provides an OpenAI-compatible API endpoint for easier integration.
Allows specifying text, reference audio, and reference text for cloning.
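As an illustration of the OpenAI-compatible endpoint, the sketch below builds a request in the shape the OpenAI speech API uses (a JSON body with model, input, and voice). The base URL, the model value, and the voice name are assumptions; which fields this wrapper actually honors, and which voice names it accepts, is not stated in the summary.

```python
import json
import urllib.request


def build_speech_request(base_url, text, voice, model="tts-1"):
    """Build (but do not send) an OpenAI-style POST to /v1/audio/speech."""
    payload = {"model": model, "input": text, "voice": voice}
    return urllib.request.Request(
        base_url.rstrip("/") + "/v1/audio/speech",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def save_speech(req, out_path, timeout=120):
    """Send the request and write the returned audio bytes to out_path."""
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        data = resp.read()
    with open(out_path, "wb") as f:
        f.write(data)
    return out_path


# Hypothetical base URL and voice name, for illustration only.
speech_request = build_speech_request(
    "http://127.0.0.1:9233", "Hello from CosyVoice2.", "english_female"
)
```

Because the request shape follows the OpenAI convention, an existing OpenAI TTS client should in principle only need its base URL changed to target this API.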

Maintenance & Community

Project maintained by jianchang512.
The README provides no community links (Discord, Slack) and no roadmap.

Licensing & Compatibility

The README does not specify a license.
Compatibility for commercial use or closed-source linking is undetermined without a license.

Limitations & Caveats

The project depends on an existing CosyVoice2 deployment and does not state its own license, which may hinder commercial adoption. The README does not describe error handling or advanced configuration options.

Explore Similar Projects

MahaTTS by dubverse-ai

open-dubbing by Softcatala

echogarden by echogarden-project

xtts2-ui by BoltzmannEntropy

Open-VoiceCanvas by ItusiAI

FireRedTTS by FireRedTeam

sesame_csm_openai by phildougherty

local-talking-llm by vndee

LanguageLeapAI by SociallyIneptWeeb

easyVoice by cosin2077

VALL-E-X by Plachtaa

OpenVoice by myshell-ai