Speech datasets for recognition and synthesis
This repository is a curated list of speech datasets, covering primarily Chinese, English, Japanese, Korean, Russian, French, Spanish, and Turkish. It categorizes datasets by application: speech recognition, speech synthesis, speaker recognition, speaker diarization, and voice activity detection. Its main benefit is a centralized, organized reference for researchers and developers seeking speech data for these tasks.
How It Works
The repository presents a comprehensive table of datasets, detailing their names, durations in hours, download addresses (primarily OpenSLR, Hugging Face, and specific project pages), and remarks on their content or application. It is structured to allow users to quickly find relevant datasets based on language, task, and data size.
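As a rough illustration of how such a table could be queried by language, task, and size once copied into code, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the `DatasetEntry` fields mirror the table columns described above, `find_datasets` is a hypothetical helper, and the catalog rows are placeholders with made-up names, hours, and URLs, not real entries from the repository.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DatasetEntry:
    """One row of the dataset table (field names are hypothetical)."""
    name: str
    language: str
    task: str
    hours: Optional[float]  # None when the table lists no duration
    url: str
    remark: str = ""


def find_datasets(
    entries: List[DatasetEntry],
    language: Optional[str] = None,
    task: Optional[str] = None,
    min_hours: Optional[float] = None,
) -> List[DatasetEntry]:
    """Return entries matching every filter that is not None."""
    results = []
    for entry in entries:
        if language is not None and entry.language.lower() != language.lower():
            continue
        if task is not None and entry.task.lower() != task.lower():
            continue
        # Entries with unknown duration cannot satisfy a size filter.
        if min_hours is not None and (entry.hours is None or entry.hours < min_hours):
            continue
        results.append(entry)
    return results


# Placeholder rows only: names, hours, and URLs are invented for this sketch.
CATALOG = [
    DatasetEntry("placeholder-corpus-a", "Chinese", "speech recognition", 150.0,
                 "https://example.org/a"),
    DatasetEntry("placeholder-corpus-b", "English", "speech synthesis", 24.0,
                 "https://example.org/b"),
    DatasetEntry("placeholder-corpus-c", "Chinese", "speech recognition", None,
                 "https://example.org/c"),
]

if __name__ == "__main__":
    for entry in find_datasets(CATALOG, language="Chinese", min_hours=100):
        print(entry.name, entry.hours, entry.url)
```

Treating a missing duration as "does not match" when a size filter is given is one possible design choice; a caller who wants to see entries of unknown size can simply omit `min_hours`.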
Quick Start & Requirements
No installation or specific requirements are mentioned, as this is a reference list. Users are directed to the provided URLs to access and download the datasets.
Maintenance & Community
The README does not name maintainers, list community channels, or state an update frequency.
Licensing & Compatibility
Dataset licenses vary and are not stated in the list itself; users must check each dataset's linked page for licensing details. Whether a dataset can be used commercially depends on its individual license.
Limitations & Caveats
The README does not host any data or provide download scripts; users must follow the listed links to external sites. Some entries lack duration information or remarks, and datasets marked "if available" are not guaranteed to be accessible. The "Free ST Chinese Mandarin Corpus" is listed under English datasets, which may be a categorization error.