AI audiobook toolkit for intelligent subtitle extraction, AI title generation, and translation
LiberSonora is an AI-powered, open-source toolkit for audiobook processing, offering features like intelligent subtitle extraction, AI-driven title generation, and multilingual translation. It targets audiobook enthusiasts and developers seeking to automate and enhance their audiobook experience, providing local, offline processing with GPU acceleration.
How It Works
The toolkit uses a modular architecture with separate services for the UI (Streamlit), audio denoising (ClearerVoice-Studio), and speech recognition and subtitle generation (FunASR). AI tasks such as title generation and translation run on large language models (for example Qwen2.5 or MiniCPM) served locally through Ollama, so inference stays private. Because the model is just an Ollama backend, users can swap in a custom LLM, and because every service runs offline, audio and text never leave the machine.
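As an illustration of the local-inference step, the following is a minimal sketch of asking an Ollama server for a chapter title. It is not LiberSonora's own code: the function names and the prompt wording are hypothetical, and it assumes Ollama is listening on its default port (11434) with a `qwen2.5` model already pulled. Only Ollama's documented `/api/generate` endpoint is used.

```python
import json
import urllib.request

# Ollama's default local endpoint; adjust if your server runs elsewhere.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_title_request(transcript: str, model: str = "qwen2.5") -> dict:
    """Build the JSON payload for a one-shot, non-streaming generation call.

    The prompt wording is an assumption made for this sketch.
    """
    prompt = (
        "Suggest a short chapter title for this audiobook transcript excerpt. "
        "Reply with the title only.\n\n" + transcript.strip()
    )
    return {"model": model, "prompt": prompt, "stream": False}


def generate_title(transcript: str, model: str = "qwen2.5", url: str = OLLAMA_URL) -> str:
    """Send the payload to a running Ollama server and return the generated text."""
    body = json.dumps(build_title_request(transcript, model)).encode("utf-8")
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req, timeout=120) as resp:
        # With "stream": False, Ollama returns one JSON object with a "response" field.
        return json.loads(resp.read())["response"].strip()
```

Calling `generate_title(subtitle_text)` needs a running Ollama instance. Passing a different `model` value is how a custom LLM would be selected in this sketch.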
Quick Start & Requirements
docker-compose -f docker-compose.gpu.yml up -d
.docker-compose
are necessary.
Highlighted Details
Maintenance & Community
The project is actively developed, with a roadmap outlining future phases including a cross-platform audiobook player. Feedback is encouraged via GitHub Issues.
Licensing & Compatibility
Limitations & Caveats
The project currently requires an NVIDIA GPU due to dependencies on ClearerVoice and FunASR; CPU support is a low-priority consideration. Some music players exhibit compatibility issues with the generated multilingual subtitles.
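Given the NVIDIA requirement, a quick preflight check before starting the GPU compose file may save a failed launch. The sketch below is an assumption about how one might do this, not part of the project: the helper name `nvidia_gpu_names` is hypothetical, and it only confirms that the host driver's `nvidia-smi` tool can see a GPU, not that containers can access it.

```python
import shutil
import subprocess
from typing import List


def nvidia_gpu_names() -> List[str]:
    """Return GPU names reported by nvidia-smi, or an empty list if none are visible."""
    if shutil.which("nvidia-smi") is None:
        # Driver tools are not installed or not on PATH.
        return []
    try:
        out = subprocess.run(
            ["nvidia-smi", "--query-gpu=name", "--format=csv,noheader"],
            capture_output=True,
            text=True,
            check=True,
            timeout=10,
        ).stdout
    except (subprocess.SubprocessError, OSError):
        # nvidia-smi exists but failed, timed out, or could not be executed.
        return []
    return [line.strip() for line in out.splitlines() if line.strip()]


if __name__ == "__main__":
    gpus = nvidia_gpu_names()
    if gpus:
        print("NVIDIA GPU(s) detected:", ", ".join(gpus))
    else:
        print("No NVIDIA GPU detected; the GPU compose file is unlikely to start.")
```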