audiblez by santinic

Generate audiobooks from e-books

Created 8 months ago
5,527 stars

Top 9.2% on SourcePulse

Project Summary

Audiblez converts e-books into audiobooks for people who prefer listening to reading. It lets users create personalized audio versions of their own e-books, turning static text into a portable, accessible audiobook.

How It Works

Audiblez uses the Kokoro-82M text-to-speech model to synthesize natural-sounding speech from e-book content. It reads an e-book, typically in EPUB format, splits it into chapters or sections, and compiles the synthesized audio into an M4B audiobook file. On compatible hardware, CUDA GPU acceleration speeds up conversion significantly.
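
The first step described above, breaking an EPUB into sections of plain text, can be sketched with the Python standard library alone. This is a minimal illustration and not Audiblez's own code: it relies only on the fact that an EPUB is a ZIP archive of XHTML documents, and the names `epub_sections` and `_TextExtractor` are hypothetical. As a simplification, documents are read in sorted file-name order instead of the reading order declared in the EPUB's package file.

```python
import zipfile
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects visible text, skipping <script>, <style> and <head> content."""

    SKIP = {"script", "style", "head"}

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside the skipped elements.
        if self._skip_depth == 0 and data.strip():
            self.parts.append(data.strip())


def epub_sections(epub_file):
    """Return (archive_name, plain_text) pairs for each XHTML/HTML document.

    Assumption: sorted archive-name order stands in for the real reading order.
    """
    sections = []
    with zipfile.ZipFile(epub_file) as archive:
        for name in sorted(archive.namelist()):
            if not name.lower().endswith((".xhtml", ".html", ".htm")):
                continue
            parser = _TextExtractor()
            parser.feed(archive.read(name).decode("utf-8", errors="replace"))
            parser.close()
            text = " ".join(parser.parts)
            if text:
                sections.append((name, text))
    return sections
```

Calling `epub_sections("book.epub")` (a placeholder path) would yield one text chunk per document, each of which could then be handed to a text-to-speech model.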

Quick Start & Requirements

  • Installation: pip install audiblez
  • Prerequisites: ffmpeg and espeak-ng must be installed on the system. The GUI additionally requires pillow and wxpython. CUDA support requires a PyTorch build with CUDA enabled.
  • Setup: No configuration is needed beyond the pip install. Conversion speed differs by roughly 10x between CPU (approx. 60 chars/sec) and GPU (approx. 600 chars/sec).
  • Links: Kokoro-82M voices, PyTorch CUDA installation
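
Before a first conversion, it can help to confirm that the prerequisites listed above are actually present. The following is a minimal sketch, not part of Audiblez: the helper names `missing_prerequisites` and `cuda_available` are hypothetical, and the check assumes the tools are installed under the executable names `ffmpeg` and `espeak-ng`.

```python
import importlib.util
import shutil


def missing_prerequisites(tools=("ffmpeg", "espeak-ng")):
    """Return the names of required command-line tools not found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


def cuda_available():
    """Return True only if PyTorch is installed and reports a usable CUDA device."""
    if importlib.util.find_spec("torch") is None:
        return False  # PyTorch is not installed at all
    import torch
    return torch.cuda.is_available()


if __name__ == "__main__":
    missing = missing_prerequisites()
    print("Missing tools:", ", ".join(missing) if missing else "none")
    print("CUDA available:", cuda_available())
```

If `cuda_available()` returns False on a machine with an NVIDIA GPU, the installed PyTorch build is most likely CPU-only.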

Highlighted Details

  • Supports multiple languages including English (US/UK), Spanish, French, Hindi, Italian, Japanese, Brazilian Portuguese, and Mandarin Chinese.
  • Offers adjustable speech speed (0.5x to 2.0x).
  • Provides both a command-line interface and a graphical user interface (GUI).
  • CUDA support enables roughly 10x faster processing on NVIDIA GPUs (approx. 600 vs. 60 chars/sec on CPU).
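
To put the CPU and GPU figures in perspective, the approximate rates quoted under Quick Start can be turned into a rough conversion-time estimate. This is plain arithmetic on those two numbers; the function name `estimated_minutes` and the 360,000-character book length are illustrative assumptions, and real throughput will vary with hardware.

```python
CPU_CHARS_PER_SEC = 60   # approximate CPU rate quoted under Quick Start
GPU_CHARS_PER_SEC = 600  # approximate CUDA GPU rate quoted under Quick Start


def estimated_minutes(char_count, chars_per_sec):
    """Estimate conversion time in minutes for a text of char_count characters."""
    if chars_per_sec <= 0:
        raise ValueError("chars_per_sec must be positive")
    return char_count / chars_per_sec / 60


book_chars = 360_000  # hypothetical book length in characters
print(f"CPU: {estimated_minutes(book_chars, CPU_CHARS_PER_SEC):.0f} min")  # CPU: 100 min
print(f"GPU: {estimated_minutes(book_chars, GPU_CHARS_PER_SEC):.0f} min")  # GPU: 10 min
```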

Maintenance & Community

The project was authored by Claudio Santini in 2025. Further community or maintenance details are not specified in the README.

Licensing & Compatibility

  • License: MIT License.
  • Compatibility: The MIT license generally permits commercial use and linking with closed-source projects. On the hardware side, the README notes that Apple Silicon is not currently supported because there is no Kokoro implementation in MLX.

Limitations & Caveats

The project explicitly states that Apple Silicon is not supported at this time. CUDA acceleration on NVIDIA GPUs is documented, but performance on other hardware accelerators is not. Speech quality and naturalness depend on the Kokoro-82M model, which was trained on less than 100 hours of audio.

Health Check

  • Last commit: 6 months ago
  • Responsiveness: Inactive
  • Pull requests (30d): 3
  • Issues (30d): 4
  • Star history: 1,223 stars in the last 30 days
