PyTorch implementation of FastSpeech 2 for text-to-speech
This PyTorch implementation of FastSpeech 2 provides an end-to-end text-to-speech system capable of generating high-quality speech with controllable prosody. It targets researchers and developers in speech synthesis, offering support for English and Mandarin, single and multi-speaker models, and integration with popular vocoders like MelGAN and HiFi-GAN.
How It Works
This implementation follows the FastSpeech 2 architecture and uses F0 values as pitch features, unlike later versions of the paper, which use pitch spectrograms extracted by continuous wavelet transform. Because the model is non-autoregressive, inference is faster than with autoregressive models such as Tacotron 2. Speaking rate, volume, and pitch can be controlled at synthesis time by adjusting the duration, energy, and pitch ratios.
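To make the ratio-based control concrete, here is a minimal sketch of how a duration ratio could stretch or compress speech in a non-autoregressive model. It is an illustration under stated assumptions, not the repository's code: the names length_regulate, scale_contour, and duration_ratio are hypothetical, and plain Python lists stand in for tensors.

```python
def length_regulate(phoneme_features, durations, duration_ratio=1.0):
    """Expand phoneme-level features to frame level.

    Each feature is repeated for its predicted duration (in frames),
    scaled by duration_ratio. Hypothetical helper for illustration.
    """
    frames = []
    for feature, duration in zip(phoneme_features, durations):
        # Scale the predicted duration, then round to a whole number of frames.
        n_frames = max(0, round(duration * duration_ratio))
        frames.extend([feature] * n_frames)
    return frames


def scale_contour(values, ratio=1.0):
    """Scale a predicted pitch or energy contour by a constant ratio."""
    return [v * ratio for v in values]


# Strings stand in for phoneme hidden vectors (assumed example data).
phonemes = ["HH", "AH", "L", "OW"]
durations = [2, 3, 1, 4]

normal = length_regulate(phonemes, durations)
slow = length_regulate(phonemes, durations, duration_ratio=2.0)
print(len(normal), len(slow))
```

In this sketch a ratio above 1.0 yields more frames and therefore slower speech. All frames are produced in one pass without feeding earlier outputs back in, which is what allows parallel, non-autoregressive inference.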
Quick Start & Requirements
Install dependencies: pip3 install -r requirements.txt
Pretrained models: place checkpoints in the output/ckpt/ subdirectories.
Synthesize speech (LJSpeech configs): python3 synthesize.py --text "YOUR_DESIRED_TEXT" --restore_step 900000 --mode single -p config/LJSpeech/preprocess.yaml -m config/LJSpeech/model.yaml -t config/LJSpeech/train.yaml
Highlighted Details
Maintenance & Community
The project is based on xcmyz's FastSpeech implementation. Updates in 2021 added support for English and Mandarin, multi-speaker models, and vocoder integration. The repository is open for contributions and bug reports.
Licensing & Compatibility
The README does not state a license explicitly. Its phrasing "Feel free to use/modify the code" suggests that research use, and potentially commercial use, is permitted, but check the repository's license terms before relying on this.
Limitations & Caveats
This implementation adds a Tacotron-2-styled Post-Net, which is not part of the original FastSpeech 2 paper. It also predicts pitch and energy at the phoneme level for better prosody, which differs from some other FastSpeech 2 variants. Generating alignments requires the Montreal Forced Aligner.
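As an illustration of phoneme-level prosody features, the sketch below averages a frame-level F0 contour over each phoneme, using per-phoneme durations of the kind a forced aligner provides. It is a simplified sketch, not the repository's preprocessing: average_by_phoneme is a hypothetical name, the numbers are made up, and details such as unvoiced-frame handling and normalization are ignored.

```python
def average_by_phoneme(frame_values, durations):
    """Average frame-level values (e.g. F0 or energy) over each phoneme.

    durations[i] is the number of frames aligned to phoneme i.
    Hypothetical helper for illustration.
    """
    if sum(durations) != len(frame_values):
        raise ValueError("durations must sum to the number of frames")
    phoneme_values = []
    start = 0
    for duration in durations:
        segment = frame_values[start:start + duration]
        # A zero-length phoneme has no frames; use 0.0 as a placeholder.
        phoneme_values.append(sum(segment) / len(segment) if segment else 0.0)
        start += duration
    return phoneme_values


# Assumed example data: six frames aligned to three phonemes.
frame_f0 = [100.0, 110.0, 120.0, 200.0, 220.0, 150.0]
durations = [3, 2, 1]
phoneme_f0 = average_by_phoneme(frame_f0, durations)
print(phoneme_f0)  # [110.0, 210.0, 150.0]
```

The model then has one pitch target per phoneme instead of one per frame, which is why accurate alignments matter for this setup.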