wav2vec2-live by oliverguhr

Live speech recognition demo using wav2vec 2.0

Created 4 years ago

375 stars

Top 75.7% on SourcePulse

2 Experts Love This Project

Andrey Vasnetsov

Cofounder of Qdrant

Patrick von Platen

Author of Hugging Face Diffusers; Research Engineer at Mistral

Project Summary

This project provides a Python library for real-time speech recognition with wav2vec 2.0 models from the Hugging Face Hub. It runs a pre-trained wav2vec 2.0 model of the user's choice on live microphone input, which supports applications such as live transcription or voice command interfaces.

How It Works

The library uses the wav2vec 2.0 model architecture for speech-to-text conversion. It captures audio from the system's default microphone, processes it in chunks, and feeds each chunk to the specified wav2vec 2.0 model for inference. Any wav2vec 2.0 model available on the Hugging Face Hub can be used, and the model is downloaded automatically on first use.
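The chunk-to-text step described above can be sketched as follows. This is a minimal illustration, not the project's own code: it assumes 16-bit mono PCM chunks at 16 kHz, uses the Hugging Face transformers classes Wav2Vec2Processor and Wav2Vec2ForCTC, and the helper names (pcm16_to_float, load_model, transcribe) are hypothetical. The checkpoint name in the usage comment is one example from the Hub and is not taken from the README.

```python
import struct
from typing import List

SAMPLE_RATE = 16_000  # assumption: wav2vec 2.0 checkpoints are typically trained on 16 kHz audio


def pcm16_to_float(chunk: bytes) -> List[float]:
    """Convert little-endian 16-bit PCM bytes to floats in [-1.0, 1.0)."""
    count = len(chunk) // 2
    samples = struct.unpack(f"<{count}h", chunk[: count * 2])
    return [sample / 32768.0 for sample in samples]


def load_model(model_name: str):
    """Load a processor/model pair; weights are downloaded on first use."""
    # Imported lazily so the PCM helper above works without these packages.
    from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

    processor = Wav2Vec2Processor.from_pretrained(model_name)
    model = Wav2Vec2ForCTC.from_pretrained(model_name)
    return processor, model


def transcribe(processor, model, samples: List[float]) -> str:
    """Run one chunk of audio through the model and greedily decode the CTC output."""
    import torch

    inputs = processor(samples, sampling_rate=SAMPLE_RATE, return_tensors="pt")
    with torch.no_grad():
        logits = model(inputs.input_values).logits
    predicted_ids = torch.argmax(logits, dim=-1)
    return processor.batch_decode(predicted_ids)[0]


# Hypothetical usage (requires torch and transformers, plus network access on first run):
#   processor, model = load_model("facebook/wav2vec2-base-960h")
#   text = transcribe(processor, model, pcm16_to_float(raw_chunk_bytes))
```

Keeping the byte conversion separate from inference means the microphone capture code only has to hand over raw buffers, and the model can be swapped by changing a single string.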

Quick Start & Requirements

Set up a virtual environment, then install the dependencies with pip install -r requirements.txt.
On Ubuntu, pyaudio requires the portaudio19-dev package.
Usage example: python live_asr.py
Official documentation and demo links are not provided in the README.
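After installation, a quick import check can confirm that the environment is usable before running the script. The sketch below is only an illustration: the module names in ASSUMED_MODULES are a guess at what requirements.txt installs (the listing above does not reproduce that file), and missing_modules is a hypothetical helper.

```python
import importlib.util
from typing import Iterable, List

# Assumption: these top-level modules are what the requirements file provides.
ASSUMED_MODULES = ("pyaudio", "torch", "transformers")


def missing_modules(names: Iterable[str]) -> List[str]:
    """Return the module names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_modules(ASSUMED_MODULES)
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All assumed dependencies are importable.")
```

If pyaudio shows up as missing on Ubuntu, the likely cause is that portaudio19-dev was not installed before running pip.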

Highlighted Details

Supports any wav2vec 2.0 model from the Hugging Face model hub.
Processes audio directly from the system's default audio device.
Reports the inference time and sample length alongside each transcription.
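The two reported figures can be derived with very little code. The sketch below assumes that "sample length" means the duration of the audio chunk in seconds and that inference time is wall-clock time around the recognizer call; timed_transcription and its recognize parameter are hypothetical names, not part of the project.

```python
import time
from typing import Callable, Sequence, Tuple


def timed_transcription(
    recognize: Callable[[Sequence[float]], str],
    samples: Sequence[float],
    sample_rate: int = 16_000,
) -> Tuple[str, float, float]:
    """Return (text, sample_length_seconds, inference_time_seconds) for one chunk."""
    start = time.perf_counter()
    text = recognize(samples)
    inference_time = time.perf_counter() - start
    sample_length = len(samples) / sample_rate  # duration of the audio in seconds
    return text, sample_length, inference_time


# Example with a stand-in recognizer; a real one would call the wav2vec 2.0 model.
text, sample_length, inference_time = timed_transcription(
    lambda chunk: "hello world", [0.0] * 8_000
)
print(f"{sample_length:.3f}s audio, {inference_time:.3f}s inference: {text}")
```

Comparing the two numbers gives a rough real-time factor: if inference time regularly exceeds sample length, the recognizer cannot keep up with the microphone.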

Maintenance & Community

No information on contributors, sponsorships, community channels, or roadmap is available in the README.

Licensing & Compatibility

The README does not specify a license.

Limitations & Caveats

The project relies on the system's default audio device, so the default must be configured manually if it is not already set correctly. The README notes that pyaudio may print an "attempt to connect to server failed" message, which can be safely ignored if the JACK audio server is not in use.
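When the default device is wrong, it helps to see which input devices PortAudio exposes. The sketch below is one way to do that with pyaudio's device query methods (get_device_count, get_device_info_by_index, get_default_input_device_info); list_input_devices is a hypothetical helper, and the source does not say whether the project offers its own device selection.

```python
from typing import Any, Dict, List


def list_input_devices(audio: Any) -> List[Dict[str, Any]]:
    """Return index and name for every device that has at least one input channel.

    `audio` is expected to behave like a pyaudio.PyAudio instance.
    """
    devices = []
    for index in range(audio.get_device_count()):
        info = audio.get_device_info_by_index(index)
        if info.get("maxInputChannels", 0) > 0:
            devices.append({"index": index, "name": info["name"]})
    return devices


# Usage with the real library (requires pyaudio and PortAudio to be installed):
#   import pyaudio
#   audio = pyaudio.PyAudio()
#   print("Default input:", audio.get_default_input_device_info()["name"])
#   for device in list_input_devices(audio):
#       print(device["index"], device["name"])
#   audio.terminate()
```

The printed index can then be passed as input_device_index when opening a pyaudio stream, or used to identify which device to set as the system default.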

Health Check

Last Commit

1 year ago

Responsiveness

Inactive

Star History

1 star in the last 30 days