TransformerTTS  by spring-media

TensorFlow 2 implementation for non-autoregressive text-to-speech

created 5 years ago
1,150 stars

Top 34.3% on sourcepulse

Project Summary

This repository provides a TensorFlow 2 implementation of a non-autoregressive Transformer model for Text-to-Speech (TTS). It aims to deliver fast, robust, and controllable speech synthesis, and is intended for researchers and developers working on TTS systems.

How It Works

The core of the project is a non-autoregressive Transformer architecture inspired by FastSpeech and FastPitch. Because the model does not generate its output step by step, with each step conditioned on the previous one, inference is faster, synthesis is more robust against repeats and attention failures, and speech speed and pitch can be controlled explicitly. The model generates mel-spectrograms, which external vocoders such as MelGAN or HiFiGAN then convert to audio waveforms.
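To make the speed-control idea concrete, here is a minimal sketch of FastSpeech-style length regulation: each input-token encoding is repeated for its predicted number of mel frames, and a speed factor scales those durations. This is an illustration of the general technique only, not code from this repository; the function name `length_regulate` and its parameters are hypothetical, and plain Python lists stand in for tensors.

```python
from typing import List, Sequence


def length_regulate(
    encodings: Sequence[Sequence[float]],
    durations: Sequence[float],
    speed: float = 1.0,
) -> List[List[float]]:
    """Repeat each token encoding for its (speed-scaled) number of frames.

    Hypothetical helper for illustration; not part of the repository's API.
    speed > 1.0 shortens every duration (faster speech),
    speed < 1.0 lengthens it (slower speech).
    """
    if len(encodings) != len(durations):
        raise ValueError("need exactly one duration per encoding")
    if speed <= 0:
        raise ValueError("speed must be positive")

    frames: List[List[float]] = []
    for vector, duration in zip(encodings, durations):
        # Scale the predicted duration, then round to a whole frame count.
        n_frames = max(0, int(round(duration / speed)))
        # Copy the vector so frames do not share the same list object.
        frames.extend(list(vector) for _ in range(n_frames))
    return frames


# Three token encodings with predicted durations of 2, 4 and 6 frames.
token_encodings = [[0.1], [0.2], [0.3]]
predicted_durations = [2, 4, 6]

normal = length_regulate(token_encodings, predicted_durations)
faster = length_regulate(token_encodings, predicted_durations, speed=2.0)
print(len(normal), len(faster))
```

Once the expanded sequence has a fixed length, a decoder can in principle predict every mel frame at once, which is where the speed advantage over step-by-step generation comes from.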

Quick Start & Requirements

  • Install: pip install -r requirements.txt
  • Prerequisites: Python >= 3.6, espeak (install via apt-get or brew).
  • Pre-trained LJSpeech model available for quick inference.
  • Official Colab notebook available for trying out the model.
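Before installing the requirements, it can help to confirm the two prerequisites listed above. The following is a small sketch using only the Python standard library; the helper name `check_prerequisites` is hypothetical, and treating `espeak-ng` as an acceptable alternative binary name is an assumption, not something the README states.

```python
import shutil
import sys

MIN_PYTHON = (3, 6)  # minimum version listed in the prerequisites


def check_prerequisites(espeak_names=("espeak", "espeak-ng")):
    """Report whether Python is recent enough and an espeak binary is on PATH.

    Hypothetical helper; "espeak-ng" as a fallback name is an assumption.
    """
    python_ok = sys.version_info[:2] >= MIN_PYTHON
    # shutil.which returns the executable's path, or None if it is not found.
    espeak_path = next(
        (path for path in map(shutil.which, espeak_names) if path), None
    )
    return {"python_ok": python_ok, "espeak_path": espeak_path}


if __name__ == "__main__":
    status = check_prerequisites()
    print("Python >= 3.6:", status["python_ok"])
    print("espeak found at:", status["espeak_path"] or "not found")
```

If `espeak_path` comes back empty, install espeak with apt-get or brew as noted above before running the project.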

Highlighted Details

  • Non-autoregressive Transformer for fast and robust TTS.
  • Supports pitch prediction and controllable speech speed.
  • Compatible with MelGAN and HiFiGAN vocoders for waveform generation.
  • Includes scripts for training aligner, TTS models, and extracting durations.

Maintenance & Community

  • Maintained by Francesco Cardinale.
  • The README mentions collaboration with the Mozilla TTS team.
  • No explicit links to community channels (Discord/Slack) or roadmaps are provided in the README.

Licensing & Compatibility

  • The repository does not explicitly state a license in the provided README text.
  • Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

According to the README, some pre-trained model weights are only API-compatible with specific versions of the repository, so the matching version must be checked out before using them. Support for WaveRNN has been discontinued.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: Inactive
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 5 stars in the last 90 days
