Text-to-speech tool for long texts and multi-character dubbing
EasyVoice is an open-source text-to-speech (TTS) solution designed for converting long text, such as novels, into high-quality audiobooks with support for multi-character narration. It targets users who need to generate audio content from extensive text, offering features like streaming playback, automatic subtitle generation, and AI-driven voice recommendations for different characters.
How It Works
The system leverages Microsoft Azure TTS (Edge-TTS API) and OpenAI-compatible TTS services for speech synthesis. It supports streaming to handle arbitrarily long texts and allows for custom voice parameters like rate, volume, and pitch. An AI component analyzes text segments to recommend suitable voices and configurations, enabling multi-character narration. The architecture comprises a Vue 3 frontend and a Node.js backend.
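The chunked, per-voice processing described above can be pictured with a small sketch. This is not EasyVoice's actual code: the `VoiceParams` and `SynthesisJob` types, the `splitIntoChunks` and `buildJobs` helpers, and the example parameter formats are all hypothetical, chosen only to illustrate how a long text might be cut at sentence boundaries and paired with rate, volume, and pitch settings before being sent to a TTS service.

```typescript
// Hypothetical voice settings; field names mirror the rate, volume, and pitch
// parameters mentioned above but are NOT EasyVoice's real types.
interface VoiceParams {
  voice: string;
  rate: string;   // assumed format, e.g. "+10%"
  volume: string; // assumed format, e.g. "+0%"
  pitch: string;  // assumed format, e.g. "+0Hz"
}

// One unit of work for a TTS backend (hypothetical shape).
interface SynthesisJob {
  index: number;
  text: string;
  params: VoiceParams;
}

// Split text into chunks of at most maxLen characters, preferring to break
// after sentence-ending punctuation (Latin or CJK).
function splitIntoChunks(text: string, maxLen: number): string[] {
  if (maxLen <= 0) throw new RangeError("maxLen must be positive");
  // Either "text + punctuation + trailing whitespace" or a trailing fragment.
  const sentences =
    text.match(/[^.!?。！？]*[.!?。！？]+\s*|[^.!?。！？]+$/g) ?? [];
  const chunks: string[] = [];
  const push = (s: string): void => {
    const trimmed = s.trim();
    if (trimmed.length > 0) chunks.push(trimmed);
  };
  let current = "";
  for (const sentence of sentences) {
    // Flush the current chunk if this sentence would overflow it.
    if (current.length > 0 && current.length + sentence.length > maxLen) {
      push(current);
      current = "";
    }
    // A single sentence longer than maxLen is hard-split.
    let rest = sentence;
    while (rest.length > maxLen) {
      push(rest.slice(0, maxLen));
      rest = rest.slice(maxLen);
    }
    current += rest;
  }
  push(current);
  return chunks;
}

// Pair every chunk with the same voice settings, keeping the original order.
function buildJobs(text: string, params: VoiceParams, maxLen: number): SynthesisJob[] {
  return splitIntoChunks(text, maxLen).map((chunk, index) => ({
    index,
    text: chunk,
    params,
  }));
}
```

Each job could then be synthesized and streamed to the client in `index` order, so playback can start before the whole text is processed. For multi-character narration, a similar function could assign different `params` to different segments.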
Quick Start & Requirements
Docker: docker run -d -p 3000:3000 -v $(pwd)/audio:/app/audio cosincox/easyvoice:latest
Local: pnpm i -r, then pnpm dev:root (development) or pnpm build:root and pnpm start:root (production).
Generated audio is saved to the audio directory (Docker) or ./packages/backend/audio (local).
Highlighted Details
Maintenance & Community
The project is actively maintained by cosin2077. Future plans include integrating official TTS APIs, Google TTS, and voice cloning.
Licensing & Compatibility
The project is released under the MIT License, permitting commercial use and integration with closed-source projects.
Limitations & Caveats
The quality of AI-recommended voices depends on the capabilities of the underlying large language model, and the recommendation step can be slower than direct TTS generation. The Edge-TTS API may impose rate and concurrency limits.
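When a backend limits concurrency, a client typically caps how many synthesis requests are in flight at once. The sketch below shows one generic way to do that. The `runWithConcurrency` helper is hypothetical and not part of EasyVoice, and a suitable limit for any particular TTS service is an assumption to be tuned by the user.

```typescript
// Run async tasks with at most `limit` in flight at once.
// Results are returned in the same order as the input tasks.
async function runWithConcurrency<T>(
  tasks: Array<() => Promise<T>>,
  limit: number,
): Promise<T[]> {
  if (limit < 1) throw new RangeError("limit must be at least 1");
  const results: T[] = new Array(tasks.length);
  let next = 0; // index of the next task that has not been started

  // Each worker repeatedly claims the next unstarted task until none remain.
  async function worker(): Promise<void> {
    while (next < tasks.length) {
      const i = next++;
      results[i] = await tasks[i]();
    }
  }

  const workerCount = Math.min(limit, tasks.length);
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results;
}
```

Each task would wrap one synthesis request, for example `() => synthesize(job)` where `synthesize` is whatever function calls the TTS service. If any task rejects, the returned promise rejects as well, so retry or backoff logic would need to live inside the individual tasks.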