ASR-TTS-paper-daily  by halsay

Daily AI paper updates for ASR and TTS research

Created 1 year ago
329 stars

Top 83.0% on SourcePulse

Project Summary

This repository serves as a daily curated list of recent research papers in Automatic Speech Recognition (ASR) and Text-to-Speech (TTS). It aims to keep researchers, engineers, and practitioners updated on the latest advancements in speech technology, providing a centralized resource for tracking new publications.

How It Works

The project functions as a regularly updated index of ASR and TTS research papers. Paper information is collected either by scraping or by hand, and entries are organized by publication date, title, authors, and the availability of a PDF and code. This keeps the overview focused and current.
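To make the entry layout described above concrete, here is a minimal Python sketch of how one such index entry could be modeled and sorted. It is an illustration only, not the repository's actual code: the `PaperEntry` class, its field names, and the `sort_newest_first` and `format_row` helpers are all hypothetical, and the sample data uses placeholder titles and `example.org` URLs.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class PaperEntry:
    """Hypothetical record for one paper in the index (field names are assumed)."""
    published: date
    title: str
    authors: List[str]
    pdf_url: Optional[str] = None   # None when no PDF link is available
    code_url: Optional[str] = None  # None when the authors have not published code


def sort_newest_first(entries: List[PaperEntry]) -> List[PaperEntry]:
    """Return a new list ordered by publication date, most recent first."""
    return sorted(entries, key=lambda e: e.published, reverse=True)


def format_row(entry: PaperEntry) -> str:
    """Render one entry as a pipe-separated text row, marking missing links as n/a."""
    pdf = entry.pdf_url or "n/a"
    code = entry.code_url or "n/a"
    authors = ", ".join(entry.authors)
    return f"{entry.published.isoformat()} | {entry.title} | {authors} | {pdf} | {code}"


if __name__ == "__main__":
    # Placeholder entries, not real papers.
    sample = [
        PaperEntry(date(2024, 1, 2), "Example ASR paper", ["A. Author"]),
        PaperEntry(
            date(2024, 3, 1),
            "Example TTS paper",
            ["B. Author", "C. Author"],
            pdf_url="https://example.org/tts.pdf",
        ),
    ]
    for row in map(format_row, sort_newest_first(sample)):
        print(row)
```

Keeping the PDF and code links optional reflects the point that code is only linked when the authors have made it public.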

Quick Start & Requirements

  • Access: No installation required; access via the GitHub repository.
  • Requirements: A web browser and internet connection.
  • Links: The README provides direct links to papers and associated code where available.

Highlighted Details

  • Daily updates ensure the most current research is captured.
  • Comprehensive coverage of both ASR and TTS subfields.
  • Links to code repositories facilitate reproducibility and further research.
  • Organized by publication date for easy tracking of recent trends.

Maintenance & Community

The project is maintained by halsay, with updates appearing daily. Community interaction is primarily through GitHub's issue and pull request features.

Licensing & Compatibility

The repository itself contains only curated links and metadata. Its license is described here only as likely permissive (for example, MIT), so confirm it against the repository's license file before reuse. Individual papers retain their original licenses.

Limitations & Caveats

The primary limitation is that this is a curated list, not a functional tool. It relies on the availability and accuracy of information from external sources, and the inclusion of code links is dependent on authors making them public.

Health Check

  • Last Commit: 23 hours ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 35 stars in the last 30 days

Explore Similar Projects

Starred by Patrick von Platen (Author of Hugging Face Diffusers; Research Engineer at Mistral), Benjamin Bolte (Cofounder of K-Scale Labs), and 3 more.

espnet by espnet

0.2%
9k
End-to-end speech processing toolkit for various speech tasks
Created 7 years ago
Updated 3 days ago
Starred by Georgios Konstantopoulos (CTO, General Partner at Paradigm) and Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems").

GPT-SoVITS by RVC-Boss

0.3%
51k
Few-shot voice cloning and TTS web UI
Created 1 year ago
Updated 1 week ago