voice_datasets  by jim-schwoebel

Voice dataset list for voice/sound computing

created 6 years ago
1,979 stars

Top 22.7% on sourcepulse

Project Summary

This repository provides a comprehensive, curated list of over 95 open-source datasets for voice and sound computing, targeting researchers and developers in speech recognition, emotion detection, and audio analysis. It serves as a centralized resource to discover and access diverse audio data, accelerating development in the field.

How It Works

The project acts as a directory, categorizing datasets into "Speech datasets" and "Audio events and music datasets." Each entry includes a brief description, size, speaker/actor information, emotional categories, and specific use cases like ASR, emotion recognition, or source separation. This structured approach simplifies dataset discovery for various audio processing tasks.
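The entry structure described above can be modeled in a few lines of Python. This is only a sketch for readers who want to keep their own searchable copy of the list: the `DatasetEntry` class, its field names, the task labels, and the three catalog rows are all hypothetical placeholders, not data or code from the repository.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class DatasetEntry:
    """One row of a hand-built catalog (hypothetical schema)."""
    name: str
    category: str                      # "speech" or "audio_events_music"
    description: str
    size_gb: Optional[float] = None    # None when the size is not stated
    tasks: Set[str] = field(default_factory=set)


def find_by_task(entries: List[DatasetEntry], task: str) -> List[DatasetEntry]:
    """Return the entries tagged with the given task, in catalog order."""
    return [entry for entry in entries if task in entry.tasks]


# Placeholder rows for illustration only; these are not real datasets.
CATALOG = [
    DatasetEntry("example-read-speech", "speech",
                 "Read speech from multiple speakers", 1.5, {"asr"}),
    DatasetEntry("example-acted-emotion", "speech",
                 "Acted emotional speech", 0.4, {"emotion_recognition"}),
    DatasetEntry("example-music-stems", "audio_events_music",
                 "Mixed tracks with isolated stems", 12.0, {"source_separation"}),
]

if __name__ == "__main__":
    for entry in find_by_task(CATALOG, "asr"):
        print(entry.name, entry.size_gb)
```

Keeping the use cases as a set of tags lets one entry appear under several tasks without being duplicated.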

Quick Start & Requirements

  • Access: Datasets are linked directly from the README. Download commands or links are provided per dataset.
  • Prerequisites: Varies by dataset; common requirements include sufficient disk space, internet bandwidth, and potentially specific audio processing libraries for local use.
  • Resources: Dataset sizes range from megabytes to hundreds of gigabytes.
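Because the larger datasets can run to hundreds of gigabytes, it may be worth checking free disk space before starting a download. The sketch below uses only the Python standard library; the `has_room_for` helper and its `headroom` factor (extra space for unpacking archives) are assumptions made for illustration, not something the repository provides.

```python
import shutil


def has_room_for(dataset_bytes: int, target_dir: str = ".", headroom: float = 2.0) -> bool:
    """Return True if target_dir has at least dataset_bytes * headroom free.

    headroom is an assumed safety factor: a compressed archive and its
    extracted contents often sit on disk at the same time.
    """
    free_bytes = shutil.disk_usage(target_dir).free
    return free_bytes >= dataset_bytes * headroom


if __name__ == "__main__":
    size_gb = 50  # hypothetical dataset size; take the real figure from the dataset's page
    if has_room_for(size_gb * 1024**3):
        print("Enough free space to download and unpack.")
    else:
        print("Not enough free space; pick another drive or free some up.")
```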

Highlighted Details

  • Extensive coverage of speech datasets, including emotional speech, noisy environments, and diverse accents.
  • Includes datasets for audio event detection, music analysis, and environmental sound classification.
  • Links to related projects like "Voice Computing in Python" and the "Allie" framework.
  • Mentions datasets suitable for specific tasks like wake word detection and speaker identification.

Maintenance & Community

The repository is maintained by jim-schwoebel. Feedback and new dataset suggestions are welcomed via a provided link.

Licensing & Compatibility

Licenses vary by dataset; consult each dataset's own license for usage restrictions before use. The repository itself is likely under a permissive license, but that license does not extend to the datasets it lists.

Limitations & Caveats

The repository is a curated list, not a data provider; each dataset must be downloaded separately from its own source. Some datasets are restricted to non-commercial use or require payment (e.g., TIMIT).

Health Check

  • Last commit: 1 year ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0

Star History

79 stars in the last 90 days
