epub2tts by aedocw

CLI tool to create audiobooks from epub/text files using TTS engines

Created 2 years ago

876 stars

Top 41.1% on SourcePulse


1 Expert Loves This Project

Sasha Rush

Research Scientist at Cursor; Professor at Cornell Tech

Project Summary

This project provides a Python application to convert EPUB or text files into audiobooks using various Text-to-Speech (TTS) engines, including Coqui AI TTS, OpenAI, and Microsoft Edge. It's designed for users who want to create audio versions of their digital books with features like automatic chapter detection, cover art embedding, and voice cloning.

How It Works

The application parses EPUB files to extract text and chapter structure, or processes plain text files directly. It then passes the text to one of several TTS engines: Coqui AI's XTTS for high-quality voice cloning and studio voices, Microsoft Edge for free cloud-based TTS, OpenAI, or Kokoro. Multiprocessing speeds up chapter processing, and DeepSpeed is available as optional GPU acceleration.
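The EPUB parsing step can be illustrated with a minimal sketch that uses only the Python standard library. It is not the project's own code: the function name extract_chapters and the helper class are hypothetical, and the sketch assumes a DRM-free EPUB with UTF-8 XHTML documents and treats each spine entry as one chapter.

```python
import posixpath
import xml.etree.ElementTree as ET
import zipfile
from html.parser import HTMLParser

# XML namespaces used by the EPUB container and package (OPF) files.
NS = {
    "c": "urn:oasis:names:tc:opendocument:xmlns:container",
    "opf": "http://www.idpf.org/2007/opf",
}


class _TextExtractor(HTMLParser):
    """Collects visible text, skipping <script>, <style>, and <title>."""

    SKIPPED = ("script", "style", "title")

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.parts.append(data.strip())


def extract_chapters(epub_path):
    """Return one plain-text string per non-empty spine document (hypothetical helper)."""
    chapters = []
    with zipfile.ZipFile(epub_path) as zf:
        # container.xml points at the package (OPF) file.
        container = ET.fromstring(zf.read("META-INF/container.xml"))
        opf_path = container.find(".//c:rootfile", NS).attrib["full-path"]
        opf = ET.fromstring(zf.read(opf_path))
        base = posixpath.dirname(opf_path)

        # The manifest maps item ids to file paths; the spine gives reading order.
        manifest = {
            item.attrib["id"]: item.attrib["href"]
            for item in opf.findall(".//opf:manifest/opf:item", NS)
        }
        for ref in opf.findall(".//opf:spine/opf:itemref", NS):
            href = manifest[ref.attrib["idref"]]
            parser = _TextExtractor()
            parser.feed(zf.read(posixpath.join(base, href)).decode("utf-8"))
            text = " ".join(parser.parts)
            if text:
                chapters.append(text)
    return chapters
```

Real books often split one chapter across several spine documents or rely on the NCX/nav table of contents for chapter titles, so a production parser needs more than this sketch covers.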

Quick Start & Requirements

Install: pip install . (after cloning the repository and setting up Python 3.11).
Prerequisites: Python 3.11, ffmpeg, espeak-ng. For GPU acceleration, the CUDA toolkit is recommended. macOS users need Homebrew to install espeak, pyenv, ffmpeg, and mecab. Windows users need Microsoft C++ Build Tools and Chocolatey.
Setup: Clone the repository, set up a Python 3.11 environment, and install the dependencies. Setup time varies by OS and hardware; a full dependency installation may take 30-60 minutes.
Docs: Usage Instructions
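
Before installing, it can help to confirm that the listed prerequisites are present. The following is a minimal sketch, not part of the project: the function name missing_prerequisites is hypothetical, it assumes the executables are named ffmpeg and espeak-ng on PATH, and it treats "Python 3.11" as an exact minor-version requirement, which is an assumption.

```python
import shutil
import sys

# Prerequisites named in the summary; the executable names are an assumption.
REQUIRED_TOOLS = ("ffmpeg", "espeak-ng")
REQUIRED_PYTHON = (3, 11)


def missing_prerequisites(which=shutil.which, version=None):
    """Return a list of human-readable problems; an empty list means all checks passed.

    `which` and `version` are injectable so the check can be exercised
    without touching the real environment.
    """
    if version is None:
        version = tuple(sys.version_info[:2])

    problems = []
    if tuple(version) != REQUIRED_PYTHON:
        problems.append(
            f"Python {REQUIRED_PYTHON[0]}.{REQUIRED_PYTHON[1]} expected, "
            f"found {version[0]}.{version[1]}"
        )
    for tool in REQUIRED_TOOLS:
        if which(tool) is None:
            problems.append(f"{tool} not found on PATH")
    return problems
```

Calling missing_prerequisites() with no arguments checks the current interpreter and PATH; printing the returned list shows what still needs to be installed.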

Highlighted Details

Supports multiple TTS engines: Coqui AI (VITS, XTTS), OpenAI, MS Edge, and Kokoro.
Features voice cloning with XTTS and offers 58 studio-quality voices from Coqui AI.
Includes multiprocessing (--threads N) for parallel chapter processing and optional DeepSpeed for GPU acceleration.
Can embed cover art, detect chapters automatically, and resume interrupted processes.
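
Parallel chapter processing and resuming an interrupted run fit together naturally, as the sketch below illustrates. It is not the project's implementation: chapter_path, pending_chapters, run_chapters, the output file naming, and the synthesize callback are all hypothetical, and the sketch assumes a chapter counts as finished when a non-empty output file already exists for it.

```python
from concurrent.futures import ProcessPoolExecutor
from pathlib import Path


def chapter_path(out_dir, index):
    """Hypothetical naming scheme for per-chapter audio files."""
    return Path(out_dir) / f"chapter_{index:04d}.wav"


def pending_chapters(chapters, out_dir):
    """Return (index, text) pairs for chapters without a non-empty output file."""
    todo = []
    for index, text in enumerate(chapters, start=1):
        path = chapter_path(out_dir, index)
        if path.is_file() and path.stat().st_size > 0:
            continue  # finished in an earlier run, so skip it on resume
        todo.append((index, text))
    return todo


def run_chapters(chapters, out_dir, synthesize, workers=2,
                 executor_cls=ProcessPoolExecutor):
    """Synthesize unfinished chapters in parallel; return the indices processed.

    `synthesize(text, output_path)` stands in for a real TTS call.
    """
    todo = pending_chapters(chapters, out_dir)
    with executor_cls(max_workers=workers) as pool:
        futures = [
            pool.submit(synthesize, text, str(chapter_path(out_dir, index)))
            for index, text in todo
        ]
        for future in futures:
            future.result()  # re-raise any worker error
    return [index for index, _ in todo]
```

With the default ProcessPoolExecutor, synthesize must be a picklable top-level function, and on platforms that spawn worker processes the call belongs under an if __name__ == "__main__": guard. A more robust resume check would write to a temporary file and rename it on completion, so that a partially written chapter is never mistaken for a finished one.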

Maintenance & Community

Recent contributions include refactoring, multiprocessing improvements, and NCX file support. Discussions and issues are welcome on the GitHub repository.

Licensing & Compatibility

The project is licensed under the MIT License, allowing for commercial use and integration with closed-source projects.

Limitations & Caveats

The Docker image has not been recently updated or tested and may not reliably use GPUs. The Kokoro engine's speed parameter does not currently work. EPUB files must be DRM-free.

Health Check

Last Commit: 4 weeks ago

Responsiveness: 1 day

Star History

10 stars in the last 30 days