Awesome-Speaker-Diarization by DongKeon

Collection of speaker diarization papers

created 2 years ago
295 stars

Top 90.7% on sourcepulse

Project Summary

This repository is a curated collection of academic papers on speaker diarization, a field focused on identifying "who spoke when" in audio recordings. It serves researchers and practitioners by providing a comprehensive overview of state-of-the-art techniques, datasets, and challenges in the domain.
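
To make "who spoke when" concrete, the following is a minimal sketch of what a diarization result can look like as data: a list of time segments, each tagged with a speaker label. It is not taken from the repository or from any listed paper. The function name `frames_to_segments`, the 0.5-second frame length, and the toy labels are all hypothetical, and the sketch assumes exactly one speaker per frame (no overlap, no silence).

```python
from typing import List, Tuple

# A diarization segment: (start_sec, end_sec, speaker_label).
Segment = Tuple[float, float, str]


def frames_to_segments(frame_labels: List[str], frame_sec: float = 0.5) -> List[Segment]:
    """Merge consecutive frames that share a speaker label into segments.

    Assumes one label per fixed-length frame (a simplification for
    illustration; real systems also handle overlap and non-speech).
    """
    segments: List[Segment] = []
    for i, label in enumerate(frame_labels):
        start = i * frame_sec
        end = (i + 1) * frame_sec
        if segments and segments[-1][2] == label:
            # Same speaker as the previous frame: extend the open segment.
            prev_start, _, _ = segments[-1]
            segments[-1] = (prev_start, end, label)
        else:
            # Speaker change: open a new segment.
            segments.append((start, end, label))
    return segments


# Toy frame-level labels (hypothetical), one per 0.5 s frame.
labels = ["A", "A", "B", "B", "B", "A"]
segments = frames_to_segments(labels)
for start, end, speaker in segments:
    print(f"{start:.1f}-{end:.1f}s: speaker {speaker}")
```

The papers in the list differ mainly in how they arrive at the per-frame speaker decisions; the segment list is a common way to report the final answer.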

How It Works

The repository organizes papers by methodology, including End-to-End Neural Diarization (EEND), clustering-based approaches, and methods incorporating speaker embeddings. It also categorizes papers by specific applications and challenges, such as multi-channel audio, online diarization, and integration with Automatic Speech Recognition (ASR). This structured approach allows users to easily navigate and discover relevant research.

Quick Start & Requirements

This is a curated list of papers, not a software library. No installation or specific requirements are needed beyond a web browser to access the linked papers.

Highlighted Details

  • Extensive coverage of End-to-End Neural Diarization (EEND) techniques, including BLSTM-EEND, SA-EEND, EEND-EDA, CB-EEND, and Transformer-based models.
  • Detailed sections on related tasks like speaker recognition, speech separation, and language diarization.
  • Comprehensive lists of papers from major challenges such as VoxSRC, DIHARD, and MISP, often including winning system descriptions.
  • Inclusion of papers on novel approaches like multimodal diarization (audio-visual) and LLM-based post-processing.

Maintenance & Community

The repository is maintained by DongKeon, who welcomes issues or pull requests for papers the list has missed. It also links to other "awesome" lists on speaker diarization.

Licensing & Compatibility

This repository contains links to academic papers. The licensing and compatibility of the individual papers are determined by their respective publishers and authors.

Limitations & Caveats

This repository is a bibliography and does not provide code or implementations. Users must access the linked papers independently, and availability may depend on publisher subscriptions or open access status.

Health Check

  • Last commit: 2 months ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 21 stars in the last 90 days
