Discover and explore top open-source AI tools and projects—updated daily.
FinrandojinAI audiobook generator for voiceover production
New!
Top 84.1% on SourcePulse
A multi-voice AI audiobook generator, Alexandria transforms books into audiobooks using LLM-driven script annotation and Qwen3-TTS. It targets users needing automated, high-fidelity audiobook production with advanced voice customization, offering unique character voices, cloning, and fine-tuning capabilities.
How It Works
Alexandria employs an AI pipeline that first uses an LLM to parse book text into a structured JSON format, identifying speakers, dialogue, and TTS instructions. An optional second LLM pass refines annotations. The core is the Qwen3-TTS engine, which can run locally or remotely, synthesizing speech with per-line style control. Novelty lies in its comprehensive voice generation suite: cloning from short audio samples, designing voices from text descriptions, and persistent voice identity training via LoRA fine-tuning.
Quick Start & Requirements
Installation is recommended via Pinokio, or a Google Colab notebook is available. A separate OpenAI-compatible LLM server (e.g., LM Studio, Ollama) must be running. A GPU with 8 GB VRAM minimum (16 GB+ recommended) is advised for optimal performance, supporting NVIDIA (CUDA 12.8+) and AMD (ROCm 6.3+ on Linux). CPU mode is available but significantly slower. Requires ~20 GB disk space and 16 GB RAM recommended. Initial TTS model downloads (~3.5 GB per variant) occur on first use.
Highlighted Details
Maintenance & Community
The project notes a recent surge in user attention, which may lead to slower issue response times. No specific community channels (Discord/Slack) or prominent contributors are detailed in the README.
Licensing & Compatibility
Licensed under MIT, Alexandria is generally compatible with commercial use and closed-source linking without significant restrictions.
Limitations & Caveats
A separate LLM server is a mandatory prerequisite. AMD GPUs on Windows and Apple Silicon Macs are limited to CPU processing, resulting in substantially slower performance. Initial TTS model downloads require stable internet connectivity.
2 days ago
Inactive
RVC-Boss
CorentinJ