PyTorch implementation for fast diffusion text-to-speech
ProDiff is a fast, high-fidelity text-to-speech (TTS) pipeline built on conditional diffusion probabilistic models and aimed at industrial deployment. It targets researchers and developers who need efficient, high-quality speech synthesis.
How It Works
ProDiff uses a two-stage pipeline: the ProDiff acoustic model followed by the FastDiff neural vocoder. Both models use progressive diffusion, and synthesis speed is controlled by the number of reverse sampling steps in each model: fewer steps mean faster synthesis. The design prioritizes speed without significantly compromising speech quality, making it suitable for real-time or near-real-time applications.
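To make the role of the reverse sampling step count concrete, here is a minimal toy sketch of a deterministic, DDIM-style reverse loop in plain Python. It is not ProDiff's or FastDiff's actual sampler: the cosine schedule, the function names (alpha_bar, reverse_sample) and the denoiser that predicts the clean signal are all assumptions made for illustration. The point it shows is that the number of denoiser calls, and therefore the synthesis cost, scales directly with the chosen number of steps.

```python
import math
import random


def alpha_bar(s):
    """Signal level at time s in [0, 1] (assumed cosine schedule: 1 at s=0, 0 at s=1)."""
    return math.cos(0.5 * math.pi * s) ** 2


def reverse_sample(denoiser, dim, num_steps, seed=0):
    """Run num_steps deterministic (DDIM-style) reverse steps starting from pure noise.

    denoiser(x_t, s) is a hypothetical stand-in for a trained network that
    predicts the clean signal x0 from the noisy input x_t at time s.
    Returns the final sample and the number of denoiser calls made.
    """
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(dim)]  # at s = 1 the sample is pure noise
    calls = 0
    for i in range(num_steps, 0, -1):
        s_t, s_prev = i / num_steps, (i - 1) / num_steps
        ab_t, ab_prev = alpha_bar(s_t), alpha_bar(s_prev)
        x0_hat = denoiser(x, s_t)
        calls += 1
        # Noise implied by the current sample and the predicted clean signal.
        eps_hat = [(xi - math.sqrt(ab_t) * x0i) / math.sqrt(1.0 - ab_t)
                   for xi, x0i in zip(x, x0_hat)]
        # Re-noise the prediction to the lower noise level at s_prev.
        x = [math.sqrt(ab_prev) * x0i + math.sqrt(1.0 - ab_prev) * ei
             for x0i, ei in zip(x0_hat, eps_hat)]
    return x, calls


# Toy usage: a "perfect" denoiser that always returns a fixed target vector.
target = [0.5, -1.0, 0.25]
for steps in (2, 4, 8):
    sample, calls = reverse_sample(lambda x_t, s: list(target), dim=3, num_steps=steps)
    print(steps, calls, sample)
```

With a real network the prediction improves as the input gets less noisy, so the step count trades quality against speed; in this toy the denoiser is exact, so only the cost changes.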
Quick Start & Requirements
Download the pretrained checkpoints with snapshot_download from Hugging Face Hub and move them to the checkpoints/ directory.
Highlighted Details
Maintenance & Community
Licensing & Compatibility
Limitations & Caveats