Collection of speaker diarization papers
This repository is a curated collection of academic papers on speaker diarization, a field focused on identifying "who spoke when" in audio recordings. It serves researchers and practitioners by providing a comprehensive overview of state-of-the-art techniques, datasets, and challenges in the domain.
How It Works
The repository organizes papers by methodology, including End-to-End Neural Diarization (EEND), clustering-based approaches, and methods incorporating speaker embeddings. It also categorizes papers by specific applications and challenges, such as multi-channel audio, online diarization, and integration with Automatic Speech Recognition (ASR). This structured approach allows users to easily navigate and discover relevant research.
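As a rough illustration of the clustering-based category mentioned above, the following is a minimal sketch, not taken from any paper in the list. It assumes each audio segment already has a speaker embedding (here, hand-written 2-D toy vectors). It assigns segments to speakers with a greedy cosine-similarity rule, then merges adjacent segments into "who spoke when" turns. The function names, the `threshold` value, and the greedy rule are all illustrative assumptions; published systems typically use learned embeddings and stronger clustering methods.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def diarize(segments, threshold=0.8):
    """Greedy clustering sketch (hypothetical helper, not a library API).

    segments: list of (start_sec, end_sec, embedding) in time order.
    Returns a list of (start_sec, end_sec, speaker_id) turns.
    """
    centroids = []  # per-speaker running sum of embeddings (direction only matters)
    turns = []
    for start, end, emb in segments:
        # Pick the most similar known speaker, if any clears the threshold.
        best, best_sim = None, threshold
        for idx, centroid in enumerate(centroids):
            sim = cosine(emb, centroid)
            if sim >= best_sim:
                best, best_sim = idx, sim
        if best is None:
            # No speaker is similar enough: open a new cluster.
            centroids.append(list(emb))
            best = len(centroids) - 1
        else:
            # Fold this segment into the matched speaker's centroid.
            centroids[best] = [c + e for c, e in zip(centroids[best], emb)]
        # Merge with the previous turn when the same speaker continues.
        if turns and turns[-1][2] == best and turns[-1][1] == start:
            turns[-1] = (turns[-1][0], end, best)
        else:
            turns.append((start, end, best))
    return turns


# Toy input: two distinct "voices" as 2-D embeddings (assumed values).
segments = [
    (0.0, 1.5, [1.0, 0.1]),
    (1.5, 3.0, [0.9, 0.2]),
    (3.0, 4.5, [0.1, 1.0]),
    (4.5, 6.0, [1.0, 0.0]),
]
turns = diarize(segments)
for start, end, speaker in turns:
    print(f"{start:.1f}-{end:.1f}s speaker_{speaker}")
```

On this toy input the first two segments merge into one turn for speaker 0, the third segment opens speaker 1, and the last returns to speaker 0. End-to-end approaches such as EEND replace the separate embedding and clustering steps with a single network that predicts speaker activity directly, which this sketch does not attempt to show.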
Quick Start & Requirements
This is a curated list of papers, not a software library. No installation or specific requirements are needed beyond a web browser to access the linked papers.
Maintenance & Community
The repository is maintained by DongKeon, who welcomes issues or pull requests for papers the list has missed. It also links to other relevant "awesome" lists for speaker diarization.
Licensing & Compatibility
This repository contains links to academic papers. The licensing and compatibility of the individual papers are determined by their respective publishers and authors.
Limitations & Caveats
This repository is a bibliography and does not provide code or implementations. Users must access the linked papers independently, and availability may depend on publisher subscriptions or open access status.