speech-synthesis-paper by wenet-e2e

Speech synthesis papers list

Created 5 years ago
1,057 stars

Top 35.7% on SourcePulse

1 Expert Loves This Project
Project Summary

This repository is a curated list of academic papers on speech synthesis, targeting researchers and engineers in the field. It provides a structured overview of key advancements and methodologies in text-to-speech (TTS) technology, enabling users to quickly identify foundational and state-of-the-art research.

How It Works

The list categorizes papers across various sub-fields of speech synthesis, including TTS frontends, acoustic models (autoregressive and non-autoregressive), vocoders, and specialized areas like expressive TTS and voice conversion. Papers are annotated with a star (★) indicating over 50 citations, serving as a guide for beginners to prioritize influential works.
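As a rough illustration of the star rule described above, the following sketch marks entries whose citation count exceeds 50 and groups the starred ones by category. The `Paper` class, its `citations` field, and the sample entries are all assumptions made for this example; the repository itself is a hand-maintained list and does not expose citation counts or any such API.

```python
from dataclasses import dataclass

STAR_THRESHOLD = 50  # from the summary: a star means "over 50 citations"


@dataclass
class Paper:
    title: str
    category: str
    citations: int  # hypothetical field; the list itself only shows the star


def label(paper: Paper) -> str:
    """Return the title, prefixed with a star if it exceeds the threshold."""
    star = "★ " if paper.citations > STAR_THRESHOLD else ""
    return f"{star}{paper.title}"


def starred_by_category(papers):
    """Group starred titles under their category, preserving input order."""
    groups = {}
    for p in papers:
        if p.citations > STAR_THRESHOLD:
            groups.setdefault(p.category, []).append(p.title)
    return groups


# Placeholder entries; titles and counts are made up for illustration.
papers = [
    Paper("Example acoustic model A", "Acoustic Model", 120),
    Paper("Example acoustic model B", "Acoustic Model", 50),
    Paper("Example vocoder C", "Vocoder", 51),
]

for p in papers:
    print(f"[{p.category}] {label(p)}")
```

Note that the comparison is strict: an entry with exactly 50 citations is not starred, matching the "over 50" wording.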

Highlighted Details

  • Comprehensive categorization of TTS research, from foundational models like Tacotron and WaveNet to recent diffusion-based approaches.
  • Includes papers on specialized TTS tasks such as expressive speech, multi-speaker synthesis, and voice conversion.
  • Papers with more than 50 citations are marked with a star (★) to highlight influential works.
  • Covers a wide range of techniques including autoregressive, non-autoregressive, flow-based, GAN-based, and diffusion-based models.

Maintenance & Community

This is a community-driven list, welcoming recommendations for new papers. It references other curated lists like "awesome-speech-recognition-speech-synthesis-papers" and "awesome-tts-samples."

Licensing & Compatibility

The repository itself does not contain code, only a list of papers. Licensing information would pertain to the individual papers or their associated code repositories, which are not directly hosted here.

Limitations & Caveats

The list is a compilation of research papers and does not provide implementations, code, or datasets. The star (★) marker is a manual annotation, so it may be incomplete or out of date.

Health Check
Last Commit: 2 years ago
Responsiveness: Inactive
Pull Requests (30d): 0
Issues (30d): 0
Star History: 6 stars in the last 30 days

Explore Similar Projects

Starred by Georgios Konstantopoulos (CTO, General Partner at Paradigm) and Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems").

GPT-SoVITS by RVC-Boss

0.3%
51k
Few-shot voice cloning and TTS web UI
Created 1 year ago
Updated 1 week ago