TTS-papers by coqui-ai

Collection of TTS research papers

Created 6 years ago

731 stars

Top 46.4% on SourcePulse

Project Summary

This repository is a curated collection of research papers and summaries on Text-to-Speech (TTS) synthesis. It gives engineers and researchers a single place to follow how TTS approaches have evolved, from foundational models to recent work.

How It Works

The repository organizes papers by key TTS concepts such as phoneme/character representations, transfer learning, attention mechanisms, non-autoregressive models, multi-speaker synthesis, and vocoders. Each entry typically includes a link to the paper, a brief summary of its core methodology, and sometimes personal insights or experimental observations.

Highlighted Details

Covers a wide range of TTS architectures, including Tacotron, FastSpeech, Glow-TTS, and GAN-based approaches.
Describes techniques for alignment, duration prediction, and speaker adaptation.
Includes summaries of vocoder models such as WaveNet, MelGAN, and WaveGlow.
Features papers on multilingual and few-shot TTS adaptation.

Maintenance & Community

The repository appears to be a static collection of links and summaries; no active development or community channels are explicitly mentioned.

Licensing & Compatibility

The repository itself does not contain code and is a collection of links to external research papers. The licensing of the linked papers would be governed by their respective publishers.

Limitations & Caveats

This repository is a curated list of papers and does not provide runnable code or implementations. The summaries are subjective and may not cover all nuances of the original research. Some entries include personal opinions (the "2 cents" notes), which should be read as opinion rather than as results from the papers.

Explore Similar Projects

ASR-TTS-paper-daily by halsay

survey by tts-tutorial

TTS-arxiv-daily by liutaocode

voicebox-pytorch by lucidrains

open-tts-tracker by Vaibhavs10

parrots by shibing624

speech-synthesis-paper by wenet-e2e

IMS-Toucan by DigitalPhonetics

voice-pro by abus-aikorea

index-tts by index-tts

TTS by coqui-ai

GPT-SoVITS by RVC-Boss