andimarafioti
Real-time TTS inference acceleration
Top 63.6% on SourcePulse
Summary
This project accelerates Qwen3-TTS, a text-to-speech model, by leveraging CUDA graphs for real-time inference. It targets developers and researchers requiring low-latency, high-throughput speech generation, offering significant performance improvements over standard PyTorch implementations without relying on external libraries like vLLM or Triton. The primary benefit is faster-than-real-time audio synthesis (a real-time factor, RTF, below 1), enabling applications like live voice assistants or interactive content generation.
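To make the metric concrete: the real-time factor is the ratio of wall-clock synthesis time to the duration of the generated audio, so RTF below 1 means the model produces speech faster than it plays back. A minimal sketch (the function name is illustrative, not from the project):

```python
def real_time_factor(synthesis_seconds: float, audio_seconds: float) -> float:
    """RTF = time spent generating / duration of the generated audio.

    RTF < 1.0 means synthesis runs faster than real time.
    """
    return synthesis_seconds / audio_seconds

# e.g. generating 10 s of speech in 2 s of wall-clock time:
rtf = real_time_factor(2.0, 10.0)  # 0.2, i.e. 5x faster than real time
```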
How It Works
The core innovation lies in capturing the entire Qwen3-TTS decode step—comprising the Talker and Code Predictor transformers—into a single torch.cuda.CUDAGraph. This eliminates the overhead of numerous small CUDA kernel launches and Python interpreter calls per step, replaying the entire sequence as one optimized GPU operation. It employs a static KV cache with padded attention to manage variable-length sequences within fixed-size tensors, contrasting with the original dynamic cache approach. This static capture and replay mechanism is key to its performance gains.
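The static-cache idea can be sketched independently of CUDA graphs: pre-allocate fixed-size key/value buffers, write each new step into the next slot in place, and mask attention scores beyond the valid prefix. Fixed shapes and stable pointers are what allow a captured graph to be replayed unchanged every step. This is a simplified single-head NumPy illustration of the mechanism, not the project's actual implementation:

```python
import numpy as np

MAX_LEN, D = 8, 4  # fixed capacity and head dimension, chosen for illustration

# Pre-allocated buffers: their shapes and addresses never change, which is
# the property that lets CUDA graphs replay identical kernels each step.
k_cache = np.zeros((MAX_LEN, D))
v_cache = np.zeros((MAX_LEN, D))
cur_len = 0

def decode_step(q, k_new, v_new):
    """One attention step against the static cache, padding masked out."""
    global cur_len
    k_cache[cur_len] = k_new           # in-place write, no reallocation
    v_cache[cur_len] = v_new
    cur_len += 1
    scores = k_cache @ q / np.sqrt(D)  # (MAX_LEN,) includes padded slots
    scores[cur_len:] = -np.inf         # mask positions beyond the valid prefix
    w = np.exp(scores - scores.max())  # softmax over the padded length
    w /= w.sum()
    return w @ v_cache                 # padded slots get zero weight

rng = np.random.default_rng(0)
out = decode_step(rng.normal(size=D), rng.normal(size=D), rng.normal(size=D))
```

With one valid entry, the masked softmax puts all weight on slot 0, so the output equals the value just written; the padded slots never contribute.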
Quick Start & Requirements
Install with pip install faster-qwen3-tts. For the demo, install with pip install -e ".[demo]" and run python demo/server.py.
Highlighted Details
Maintenance & Community
No specific details regarding maintainers, community channels (like Discord/Slack), or project roadmap were found in the provided text.
Licensing & Compatibility
Limitations & Caveats
Numerical differences may exist between the static cache (CUDA graphs) and the original dynamic cache implementations due to varying CUDA kernel paths and reduction orders, although perceptual parity is maintained through testing. The advanced ICL voice cloning mode might exhibit minor artifacts at the start of generated speech if the reference audio ends abruptly, though a silence-padding fix is applied by default.
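The reduction-order point is easy to reproduce in isolation: floating-point addition is not associative, so two kernels that sum the same values in different orders can return slightly different results that are still numerically close. A small NumPy demonstration of the general effect (unrelated to the project's specific kernels):

```python
import numpy as np

rng = np.random.default_rng(42)
x = rng.normal(size=10_000).astype(np.float32)

forward = np.float32(0.0)
for v in x:            # sequential left-to-right reduction
    forward += v

pairwise = x.sum()     # NumPy internally uses pairwise summation

# The two orders typically differ in the last bits of a float32 result,
# while agreeing to a small absolute tolerance.
```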