resemble-ai
AI-powered expressive TTS with voice cloning
Top 72.4% on SourcePulse
Summary
DramaBox is a highly expressive Text-to-Speech (TTS) system with voice cloning, built on Lightricks' LTX-2.3 audio model. It targets developers and researchers who want fine-grained control over synthesized speech: emotional delivery, speaker identity, and stylistic variation are all set through natural language prompting. The primary benefit is human-like, contextually rich audio whose delivery is steered directly by the prompt.
How It Works
This project is an IC-LoRA fine-tune of the LTX-2.3 3.3B audio-only model. Its core innovation lies in prompt-driven TTS, where detailed natural language prompts dictate speaker identity, emotion, delivery style, and even non-verbal sounds like laughs and sighs. An optional 10-second voice reference allows for timbre cloning. This approach offers a novel way to control speech synthesis nuances directly through descriptive text.
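As a rough illustration of the two inputs described above (a descriptive prompt and an optional 10-second voice reference), here is a minimal Python sketch. Both `build_prompt` and `trim_reference` are hypothetical helpers, not part of the project, and the prompt wording is an assumed layout; the source does not specify the prompt format the model expects.

```python
import wave


def build_prompt(speaker: str, emotion: str, delivery: str, line: str) -> str:
    """Join descriptive fields into one natural-language prompt.

    The sentence layout is an assumption for illustration only.
    """
    return f'{speaker}, sounding {emotion}, {delivery}, says: "{line}"'


def trim_reference(src_path: str, dst_path: str, max_seconds: float = 10.0) -> float:
    """Copy at most max_seconds of a PCM WAV file and return the kept duration."""
    with wave.open(src_path, "rb") as src:
        params = src.getparams()
        rate = src.getframerate()
        # Keep the whole clip if it is already shorter than the limit.
        keep = min(src.getnframes(), int(max_seconds * rate))
        frames = src.readframes(keep)
    with wave.open(dst_path, "wb") as dst:
        dst.setparams(params)
        # The wave module corrects the frame count in the header on close.
        dst.writeframes(frames)
    return keep / rate
```

A call such as `build_prompt("A middle-aged woman", "amused", "speaking slowly", "Well, that was unexpected.")` yields a single descriptive sentence, and `trim_reference("voice.wav", "voice_10s.wav")` would cap a longer recording at ten seconds before it is passed on as the timbre reference.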
Quick Start & Requirements
src/inference_server.py, src/inference.py, app.py).
Highlighted Details
--no-watermark).
Maintenance & Community
Developed by Resemble AI. The README credits the Lightricks team for the base LTX-2.3 model. It lists no community channels (e.g., Discord, Slack) and no detailed roadmap.
Licensing & Compatibility
Distributed under the LTX-2 Community License, which is derived from the LTX-2.3 base model license. Users must consult the LICENSE file for specific terms, particularly regarding commercial use and redistribution, as community licenses can impose restrictions.
Limitations & Caveats
The system requires substantial VRAM (~24 GB peak), which may limit its use on consumer hardware. The LTX-2 Community License needs careful review before commercial deployment. The project notes that pre-merged checkpoints yielded degraded output in its testing and recommends running inference with the LoRA loaded separately.
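To make the distinction between a separately loaded LoRA and a pre-merged checkpoint concrete, the standard LoRA formulation is sketched here. This is an assumption for illustration: the symbols (base weight $W_0$, low-rank factors $B$ and $A$, scale $\alpha$, rank $r$) are the usual LoRA notation, and the source does not describe the exact IC-LoRA variant used.

```latex
% Adapter loaded separately: the low-rank update is applied alongside the base weight.
h = W_0 x + \frac{\alpha}{r} B A x

% Pre-merged checkpoint: the update is folded into a single stored weight.
W_{\text{merged}} = W_0 + \frac{\alpha}{r} B A, \qquad h = W_{\text{merged}} \, x
```

In exact arithmetic the two forms give the same output. The source does not say what causes the degradation it observed with merged checkpoints.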