tts by wangwangit

AI platform for seamless voice and text processing

Created 6 months ago

393 stars

Top 73.4% on SourcePulse

Project Summary

VoiceCraft is an AI-driven, free, open-source platform for bidirectional voice processing, integrating Microsoft Edge TTS (20+ Chinese voices) and SiliconFlow STT. It targets users and developers needing efficient voice conversion, offering a responsive web UI and OpenAI-compatible APIs deployed via Cloudflare Workers for global, zero-configuration access.

How It Works

Deployed on Cloudflare Workers for edge computing, VoiceCraft uses Microsoft Edge TTS for high-quality speech synthesis with customizable parameters (speed, pitch, style) and the SiliconFlow API for accurate speech-to-text transcription. A unified web interface allows seamless switching between TTS and STT modes, complemented by RESTful APIs mimicking OpenAI's format for programmatic integration.
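
As a rough illustration of the OpenAI-style API mentioned above, the sketch below builds a speech request in Python. The endpoint path /v1/audio/speech, the model name, and the field names are assumptions borrowed from OpenAI's speech API and are not confirmed by this summary; the voice name is one example of an Edge TTS Chinese voice. Check the project's own API documentation before relying on any of them.

```python
import json
import urllib.request

# Demo host taken from the summary; the path is an assumption modeled on OpenAI's API.
BASE_URL = "https://tts.wangwangit.com"
SPEECH_PATH = "/v1/audio/speech"  # hypothetical path, not confirmed by the project


def build_tts_request(text, voice="zh-CN-XiaoxiaoNeural", speed=1.0, base_url=BASE_URL):
    """Build (but do not send) a POST request shaped like an OpenAI speech request."""
    payload = {
        "model": "tts-1",  # placeholder model name borrowed from OpenAI's API
        "input": text,     # text to synthesize
        "voice": voice,    # assumed to accept an Edge TTS voice identifier
        "speed": speed,    # assumed speed multiplier, 1.0 = normal
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + SPEECH_PATH,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def synthesize(text, out_path="speech.mp3", **kwargs):
    """Send the request and write the returned audio bytes to out_path."""
    request = build_tts_request(text, **kwargs)
    with urllib.request.urlopen(request, timeout=30) as response:
        audio = response.read()
    with open(out_path, "wb") as f:
        f.write(audio)
    return out_path
```

Separating request construction from sending keeps the payload easy to inspect and adjust if the real endpoint or field names differ from these assumptions.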

Quick Start & Requirements

Web Usage: Access https://tts.wangwangit.com.
Local Deployment: Clone repo, install Wrangler CLI (npm install -g wrangler), run wrangler dev.
Prerequisites: Node.js/npm for local deployment. Audio uploads limited to 10MB.
Links: Demo: https://tts.wangwangit.com.

Highlighted Details

Bidirectional Processing: Handles both Text-to-Speech and Speech-to-Text seamlessly.
OpenAI TTS API Compatibility: Offers a RESTful API mimicking OpenAI's format.
Serverless Edge Deployment: Cloudflare Workers provide global distribution, high availability, and zero-configuration access.
Extensive Chinese Voices: Over 20 distinct Chinese TTS voices with adjustable parameters.
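
To make the "adjustable parameters" above concrete, here is a small hypothetical helper that converts an OpenAI-style speed multiplier into the relative percentage string that SSML prosody rates commonly use (for example "+20%"). Whether VoiceCraft performs this exact mapping is an assumption; the function name and the rounding rule are illustrative only.

```python
def speed_to_rate(speed):
    """Convert a speed multiplier (1.0 = normal) to a relative rate string.

    Hypothetical mapping: 1.2 becomes "+20%", 0.5 becomes "-50%".
    """
    if speed <= 0:
        raise ValueError("speed must be positive")
    percent = round((speed - 1.0) * 100)
    # The "+d" format always emits a sign, as relative rate values expect.
    return f"{percent:+d}%"
```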

Maintenance & Community

Contribution: Accepts Issues and Pull Requests.
Community: WeChat public account "一只会飞的旺旺" for updates and support. No Discord/Slack mentioned.

Licensing & Compatibility

License: MIT License.
Compatibility: Permissive MIT license suitable for commercial use and integration into closed-source projects.

Limitations & Caveats

STT Token: Programmatic STT may require a custom SiliconFlow API token; the web UI uses a default.
Audio Upload Limit: STT audio files are capped at 10MB.
External API Dependency: Relies on Microsoft Edge TTS and SiliconFlow, subject to their terms and availability.
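
Because STT uploads are capped at 10MB, a client may want to check file size before sending audio. The sketch below is one way to do that in Python; it assumes "10MB" means 10 * 1024 * 1024 bytes, which the summary does not specify, and the function names are illustrative.

```python
import os

# Assumption: the documented "10MB" cap is interpreted as 10 MiB.
MAX_UPLOAD_BYTES = 10 * 1024 * 1024


def is_within_upload_limit(size_bytes, limit=MAX_UPLOAD_BYTES):
    """Return True if a non-empty payload of size_bytes fits under the limit."""
    return 0 < size_bytes <= limit


def check_audio_file(path, limit=MAX_UPLOAD_BYTES):
    """Return the file size in bytes, or raise ValueError if it cannot be uploaded."""
    size = os.path.getsize(path)
    if not is_within_upload_limit(size, limit):
        raise ValueError(f"{path} is {size} bytes; allowed range is 1 to {limit} bytes")
    return size
```

Calling check_audio_file before an upload avoids a wasted round trip for files the service would reject anyway.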

Health Check

Last Commit: 5 months ago
Responsiveness: Inactive
Pull Requests (30d): 0
Issues (30d): 2

Star History

34 stars in the last 30 days
