CLI tool to create audiobooks from epub/text files using TTS engines
This project provides a Python application to convert EPUB or text files into audiobooks using various Text-to-Speech (TTS) engines, including Coqui AI TTS, OpenAI, and Microsoft Edge. It's designed for users who want to create audio versions of their digital books with features like automatic chapter detection, cover art embedding, and voice cloning.
How It Works
The application parses EPUB files to extract text and chapter structure, or processes plain text files directly. It then passes the text to one of several TTS engines: Coqui AI's XTTS for high-quality voice cloning and studio voices, Microsoft Edge for free cloud-based TTS, or OpenAI. Multiprocessing speeds up chapter processing, and DeepSpeed provides optional GPU acceleration.
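The EPUB parsing step can be illustrated with a minimal, standard-library sketch. It is not the project's own code: the function name read_spine_text and the helper class are hypothetical, and it assumes a DRM-free EPUB with UTF-8 XHTML documents whose manifest hrefs are not percent-encoded. It reads the package document named in META-INF/container.xml and returns the text of each document in spine (reading) order.

```python
import posixpath
import zipfile
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

CONTAINER_NS = {"c": "urn:oasis:names:tc:opendocument:xmlns:container"}
OPF_NS = {"opf": "http://www.idpf.org/2007/opf"}


class _TextExtractor(HTMLParser):
    """Hypothetical helper: collects visible text, skipping head/script/style."""

    SKIP = {"script", "style", "head"}

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.parts.append(data.strip())


def read_spine_text(epub_file):
    """Return a list of (href, text) pairs in spine (reading) order."""
    with zipfile.ZipFile(epub_file) as zf:
        # container.xml points at the package (OPF) document.
        container = ET.fromstring(zf.read("META-INF/container.xml"))
        opf_path = container.find(".//c:rootfile", CONTAINER_NS).attrib["full-path"]
        opf = ET.fromstring(zf.read(opf_path))
        base = posixpath.dirname(opf_path)

        # The manifest maps item ids to files; the spine lists ids in reading order.
        manifest = {
            item.attrib["id"]: item.attrib["href"]
            for item in opf.findall("opf:manifest/opf:item", OPF_NS)
        }
        chapters = []
        for ref in opf.findall("opf:spine/opf:itemref", OPF_NS):
            href = manifest[ref.attrib["idref"]]
            parser = _TextExtractor()
            parser.feed(zf.read(posixpath.join(base, href)).decode("utf-8"))
            parser.close()  # flush any buffered trailing text
            chapters.append((href, " ".join(parser.parts)))
        return chapters
```

Each returned pair could then be handed to a TTS engine as one chapter. Real books often split a chapter across several spine documents or put several chapters in one, so actual chapter detection needs more than this one-document-per-chapter assumption.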
Quick Start & Requirements
Install with pip install . after cloning the repository and setting up Python 3.11.
Prerequisites are ffmpeg and espeak-ng. For GPU acceleration, the CUDA toolkit is recommended. macOS users require Homebrew for espeak, pyenv, ffmpeg, and mecab. Windows users need Microsoft C++ Build Tools and Chocolatey.
Highlighted Details
Multiprocessing (--threads N) for parallel chapter processing and optional DeepSpeed for GPU acceleration.
Maintenance & Community
Recent contributions have focused on refactoring, multiprocessing, and NCX file support. Discussions and issues are welcomed on the GitHub repository.
Licensing & Compatibility
The project is licensed under the MIT License, allowing for commercial use and integration with closed-source projects.
Limitations & Caveats
The Docker image has not been recently updated or tested and may not reliably use GPUs. The Kokoro engine's speed parameter is currently not functional. EPUB files must be DRM-free.
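Before converting a book, it can help to check whether an EPUB looks DRM-protected. The sketch below is a heuristic and not part of the project: the function name find_drm_markers is hypothetical. It looks for META-INF/encryption.xml and META-INF/rights.xml, the container files the EPUB OCF format reserves for encryption and rights information. Because encryption.xml is also used for font obfuscation, a match means "inspect further", not "definitely DRM".

```python
import zipfile

# Container files reserved by the EPUB OCF format for encryption and rights data.
DRM_MARKERS = ("META-INF/encryption.xml", "META-INF/rights.xml")


def find_drm_markers(epub_file):
    """Return the DRM-related container files present in the EPUB (heuristic)."""
    with zipfile.ZipFile(epub_file) as zf:
        names = set(zf.namelist())
    return [marker for marker in DRM_MARKERS if marker in names]
```

An empty list suggests the file is safe to try; a non-empty list is a cue to check the book's source before expecting a clean conversion.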