wespeaker by wenet-e2e

Speaker toolkit for verification, recognition, and diarization research

Created 4 years ago

1,212 stars

Top 32.0% on SourcePulse

Project Summary

WeSpeaker is a toolkit for speaker embedding learning, with applications in speaker verification, recognition, and diarization. It targets researchers and developers working on speaker-related tasks and provides both a command-line interface and a Python API.

How It Works

WeSpeaker learns robust speaker embeddings using neural network architectures such as ECAPA-TDNN, ResNet, and ERes2Net. Features can be extracted online during training or loaded from pre-extracted files in Kaldi format, which gives flexibility in data handling. The project reports state-of-the-art results on the VoxCeleb and CNCeleb benchmarks and supports techniques such as self-supervised learning and score calibration.
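To make the verification step concrete, here is a minimal sketch of how two speaker embeddings might be compared once they have been extracted. It does not use WeSpeaker itself: the helper names `cosine_similarity` and `same_speaker`, the toy vectors, and the 0.5 threshold are all assumptions chosen for illustration, and a real system would tune the threshold on held-out trials.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length embedding vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_a * norm_b)


def same_speaker(enroll: Sequence[float], test: Sequence[float],
                 threshold: float = 0.5) -> bool:
    """Accept the trial if the score reaches the (hypothetical) threshold."""
    return cosine_similarity(enroll, test) >= threshold


# Toy embeddings standing in for real model outputs.
enroll_emb = [1.0, 2.0, 3.0]
test_emb = [2.0, 4.0, 6.0]      # same direction as enroll_emb
other_emb = [3.0, 0.0, -1.0]    # orthogonal to enroll_emb

print(same_speaker(enroll_emb, test_emb))
print(same_speaker(enroll_emb, other_emb))
```

Cosine scoring ignores vector length and compares direction only, which is why it is a common default for embedding-based verification.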

Quick Start & Requirements

Install: pip install git+https://github.com/wenet-e2e/wespeaker.git
Development Install: Create a conda environment with Python 3.9, then install PyTorch >= 1.12.1, torchaudio, cudatoolkit=11.3, and the remaining dependencies.
Prerequisites: CUDA 11.3 is recommended for GPU acceleration.

Highlighted Details

Supports speaker verification, recognition, and diarization tasks.
Achieves state-of-the-art performance on VoxCeleb and CNCeleb datasets.
Offers various frontend options, including Whisper-encoder and WavLM.
Includes recipes for NIST SRE16 and VoxConverse datasets.

Maintenance & Community

The project is actively maintained and updated frequently, including support for new models and techniques. Community discussion takes place on WeChat.

Licensing & Compatibility

The README does not explicitly state a license. Confirm the licensing terms before commercial use or closed-source linking.

Limitations & Caveats

The absence of a clearly stated license is a significant caveat for adoption, particularly in commercial or closed-source environments.

Health Check

Last Commit: 2 weeks ago
Responsiveness: 1 day
