EEND by hitachi-speech

Research code for speaker diarization using end-to-end neural networks

created 5 years ago
404 stars

Top 72.9% on sourcepulse

Project Summary

This repository provides an End-to-End Neural Diarization (EEND) system, a neural-network-based approach to speaker diarization. It is designed for researchers and practitioners in speech processing who need a flexible framework for speaker diarization tasks, offering implementations of BLSTM and self-attentive models, including extensions for an unknown number of speakers.

How It Works

The EEND system uses a neural network to predict each speaker's activity directly, frame by frame, without a separate clustering step. Because the order of the output speakers is arbitrary, training uses a permutation-free objective: the loss is evaluated for every assignment of network outputs to reference speakers, and the smallest value is kept. The self-attentive models use attention to capture long-range dependencies in the speech signal, which may improve accuracy on difficult recordings.
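The permutation-free objective can be illustrated with a small sketch. This is not the repository's code: the function names, the plain-list data layout, and the choice of mean binary cross-entropy are assumptions made for illustration only.

```python
import itertools
import math


def bce(pred, label, eps=1e-7):
    """Binary cross-entropy between one probability and one 0/1 label."""
    p = min(max(pred, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(label * math.log(p) + (1 - label) * math.log(1.0 - p))


def permutation_free_loss(preds, labels):
    """Hypothetical helper: minimum mean BCE over all speaker orderings.

    preds, labels: lists of T frames, each a list of S values
    (probabilities for preds, 0/1 activity for labels).
    Returns (best_loss, best_perm), where output s is matched to
    reference speaker best_perm[s].
    """
    n_frames = len(preds)
    n_speakers = len(preds[0])
    best_loss, best_perm = float("inf"), None
    for perm in itertools.permutations(range(n_speakers)):
        total = 0.0
        for t in range(n_frames):
            for s in range(n_speakers):
                total += bce(preds[t][s], labels[t][perm[s]])
        loss = total / (n_frames * n_speakers)
        if loss < best_loss:
            best_loss, best_perm = loss, perm
    return best_loss, best_perm


# The network's speakers come out in the opposite order to the reference.
preds = [[0.9, 0.1], [0.8, 0.7], [0.1, 0.9]]
labels = [[0, 1], [1, 1], [1, 0]]
loss, perm = permutation_free_loss(preds, labels)
print(perm)
```

Trying every ordering costs S! loss evaluations, which is acceptable for the small speaker counts typical of diarization but grows quickly as S increases.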

Quick Start & Requirements

  • Install: Clone the repository and run cd tools && make to build Kaldi and set up the environment.
  • Prerequisites: NVIDIA GPU with CUDA Toolkit (version 8.0 to 10.1), Python environment.
  • Setup: Building Kaldi and installing dependencies can take a significant amount of time.
  • Docs: Kaldi Queue Documentation

Highlighted Details

  • Implements BLSTM and self-attentive EEND models.
  • Supports diarization for an unknown number of speakers using encoder-decoder based attractors.
  • Includes recipes for mini_librispeech and CALLHOME datasets.
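How attractors can turn frame embeddings into speaker activities for a variable number of speakers is sketched below. The summary only names "encoder-decoder based attractors", so everything here is an assumption for illustration: the attractors and their existence probabilities are taken as given (no encoder-decoder is implemented), the 0.5 threshold and the stop-at-first-low-probability rule are assumed, and all function names are hypothetical.

```python
import math


def sigmoid(x):
    """Numerically safe logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def count_speakers(existence_probs, threshold=0.5):
    """Count attractors until the first one judged not to exist (assumed rule)."""
    count = 0
    for p in existence_probs:
        if p < threshold:
            break
        count += 1
    return count


def speaker_activities(frame_embeddings, attractors, existence_probs, threshold=0.5):
    """Per-frame activity probabilities for each attractor that is kept.

    frame_embeddings: T vectors of size D; attractors: vectors of size D.
    Activity is the sigmoid of the dot product between frame and attractor.
    """
    n = count_speakers(existence_probs, threshold)
    kept = attractors[:n]
    return [
        [sigmoid(sum(e * a for e, a in zip(frame, att))) for att in kept]
        for frame in frame_embeddings
    ]


frames = [[2.0, 0.0], [0.0, 2.0]]
attractors = [[3.0, -3.0], [-3.0, 3.0], [0.0, 0.0]]
probs = [0.9, 0.8, 0.2]  # third attractor falls below the threshold
activities = speaker_activities(frames, attractors, probs)
print(len(activities[0]))  # number of speakers kept
```

Because the number of output columns follows from the existence probabilities, the same model can be applied to recordings with different speaker counts.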

Maintenance & Community

The project is associated with Hitachi Speech. No specific community channels or active development signals are immediately apparent from the README.

Licensing & Compatibility

The repository does not explicitly state a license. Users should verify licensing for commercial use or integration into closed-source projects.

Limitations & Caveats

The CUDA Toolkit requirement (8.0 <= version <= 10.1) is restrictive: these are older toolkit releases that may not be compatible with current NVIDIA drivers or GPUs. Setup, particularly building Kaldi, is complex and time-consuming.
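Before starting the lengthy build, it may be worth checking the installed toolkit against the stated range. The sketch below assumes the version is read from the text printed by nvcc --version, which contains a phrase of the form "release 10.1"; the helper names are hypothetical and not part of the repository.

```python
import re

# Range quoted in this summary: 8.0 <= version <= 10.1.
MIN_CUDA = (8, 0)
MAX_CUDA = (10, 1)


def parse_nvcc_release(text):
    """Extract (major, minor) from nvcc --version output, or None if absent."""
    match = re.search(r"release (\d+)\.(\d+)", text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def cuda_version_supported(version):
    """True if a (major, minor) tuple lies inside the stated range."""
    return MIN_CUDA <= version <= MAX_CUDA


# Example line in the style printed by nvcc --version.
sample = "Cuda compilation tools, release 10.1, V10.1.243"
version = parse_nvcc_release(sample)
print(version, cuda_version_supported(version))
```

Comparing (major, minor) tuples avoids the pitfall of comparing versions as floats or strings, where "10.1" would sort before "9.2".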

Health Check

  • Last commit: 3 years ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 7 stars in the last 90 days
