survey by tts-tutorial

Survey paper on neural speech synthesis

Created 4 years ago

371 stars

Top 76.6% on SourcePulse

Project Summary

This repository hosts a comprehensive survey paper on neural speech synthesis, also known as neural text-to-speech (TTS), providing a structured overview of the field for researchers and industry practitioners. It covers the key components of a TTS system (text analysis, acoustic models, and vocoders) alongside advanced topics such as fast, low-resource, robust, expressive, and adaptive TTS.

How It Works

The survey groups neural TTS approaches under several taxonomies and analyzes each group, describing how data flows through the models and how the models evolved. Its aim is to consolidate current research, identify trends, and discuss future directions.
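The data flow through the three components named in the Project Summary (text analysis, acoustic model, vocoder) can be illustrated with a toy sketch. Nothing below comes from the survey or the repository: every function name, parameter, and feature representation is a hypothetical placeholder, and each stage is a trivial stand-in for what would be a trained neural network in a real system. The sketch only shows how the output of one stage becomes the input of the next.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class AcousticFrame:
    """One frame of acoustic features (stand-in for e.g. a mel-spectrogram frame)."""
    values: List[float]


def text_analysis(text: str) -> List[str]:
    """Toy front end: keep ASCII letters, lowercased, as pseudo-phonemes.

    Assumption: a real front end would do text normalization and
    grapheme-to-phoneme conversion instead.
    """
    return [ch for ch in text.lower() if ch.isascii() and ch.isalpha()]


def acoustic_model(symbols: List[str], frames_per_symbol: int = 2) -> List[AcousticFrame]:
    """Toy acoustic model: expand each symbol into a fixed number of frames."""
    frames: List[AcousticFrame] = []
    for symbol in symbols:
        level = (ord(symbol) - ord("a")) / 25.0  # map 'a'..'z' to 0.0..1.0
        frames.extend(AcousticFrame([level]) for _ in range(frames_per_symbol))
    return frames


def vocoder(frames: List[AcousticFrame], samples_per_frame: int = 4) -> List[float]:
    """Toy vocoder: repeat each frame's value to produce waveform samples."""
    samples: List[float] = []
    for frame in frames:
        samples.extend([frame.values[0]] * samples_per_frame)
    return samples


def synthesize(text: str) -> List[float]:
    """Chain the three stages: text -> symbols -> frames -> samples."""
    return vocoder(acoustic_model(text_analysis(text)))


if __name__ == "__main__":
    waveform = synthesize("Hi!")
    print(len(waveform))
```

With the default settings, each retained letter yields 2 frames and each frame yields 4 samples, so the output length grows linearly with the number of letters in the input.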

Quick Start & Requirements

This repository hosts a survey paper; it contains no executable code and requires no installation. The primary resource is the PDF linked in the repository description: https://arxiv.org/pdf/2106.15561.pdf

Highlighted Details

Comprehensive coverage of neural TTS components and advanced topics.
Summarizes relevant datasets and open-source implementations.
Discusses future research directions in the field.
Provides a structured taxonomy of TTS models and their evolution.

Maintenance & Community

The survey paper is authored by Xu Tan, Tao Qin, Frank Soong, and Tie-Yan Liu of Microsoft Research Asia. Comments and suggestions are welcome.

Licensing & Compatibility

The README does not specify a license for the survey paper. It is presented as an academic resource.

Limitations & Caveats

As a survey paper, it offers no runnable system or code. Its value lies in its comprehensive review of the literature, which may become dated as the field evolves rapidly.
