Real-time transcription tool using local Whisper models
This project provides real-time speech transcription using the FastRTC framework for audio streaming and local Hugging Face Transformers models, primarily Whisper. It's designed for developers and researchers needing efficient, on-device speech-to-text capabilities with customizable ASR models and streaming parameters.
How It Works
The system leverages FastRTC to manage live audio streams, including features like Voice Activity Detection (VAD). It integrates with Hugging Face Transformers to run various Automatic Speech Recognition (ASR) models locally. The architecture prioritizes real-time performance by configuring ASR models for a batch size of 1, processing audio chunks as they become available.
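The flow described above (detect speech, cut the stream into utterances, run the recognizer on one utterance at a time) can be illustrated with a small standard-library sketch. It is not the project's code: a simple energy threshold stands in for FastRTC's VAD, and the `transcribe` callable stands in for a Transformers ASR pipeline. The names `frame_energy`, `speech_segments`, `transcribe_stream`, and the `threshold` and `max_silence` values are all hypothetical.

```python
from typing import Callable, Iterable, Iterator, List

Frame = List[float]  # one short frame of mono samples in [-1.0, 1.0]


def frame_energy(frame: Frame) -> float:
    """Mean squared amplitude of a frame (0.0 for an empty frame)."""
    return sum(s * s for s in frame) / len(frame) if frame else 0.0


def speech_segments(frames: Iterable[Frame], threshold: float = 0.01,
                    max_silence: int = 2) -> Iterator[Frame]:
    """Yield the buffered speech samples of each utterance.

    An utterance ends once `max_silence` consecutive quiet frames follow
    speech. Quiet frames themselves are dropped, not buffered.
    """
    buffer: Frame = []
    quiet = 0
    for frame in frames:
        if frame_energy(frame) >= threshold:
            buffer.extend(frame)
            quiet = 0
        elif buffer:
            quiet += 1
            if quiet >= max_silence:
                yield buffer
                buffer, quiet = [], 0
    if buffer:  # flush speech still buffered when the stream ends
        yield buffer


def transcribe_stream(frames: Iterable[Frame],
                      transcribe: Callable[[Frame], str]) -> Iterator[str]:
    """Call the recognizer on one segment at a time (batch size 1)."""
    for segment in speech_segments(frames):
        yield transcribe(segment)


if __name__ == "__main__":
    loud, quiet = [0.5] * 4, [0.0] * 4
    stream = [quiet, loud, loud, quiet, quiet, loud, quiet]
    # A stand-in recognizer that only reports the segment length.
    print(list(transcribe_stream(stream, lambda seg: f"{len(seg)} samples")))
```

Because each segment is handed to the recognizer as soon as it closes, latency is bounded by the utterance length plus one inference call, which is the trade-off a batch size of 1 makes against throughput.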
Quick Start & Requirements
Install with uv (recommended) or pip:
    uv venv --python 3.11 && source .venv/bin/activate
    uv pip install -r requirements.txt
or
    python -m venv .venv && source .venv/bin/activate
    pip install --upgrade pip
    pip install -r requirements.txt
Prerequisite: ffmpeg (install via brew on macOS or apt on Debian/Ubuntu).
Configuration: a .env file with UI_MODE (e.g., fastapi, gradio), APP_MODE (local or deployed), MODEL_ID (e.g., openai/whisper-large-v3-turbo), SERVER_NAME, and PORT.
Run: python main.py
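As a rough illustration of how the five settings might be consumed, the sketch below reads them from a mapping such as the process environment. It assumes the .env values have already been exported into the environment (for example by a loader such as python-dotenv; whether the project uses one is not stated here). The `Settings` class, the `load_settings` function, and every default value are hypothetical and are not taken from the project.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    ui_mode: str
    app_mode: str
    model_id: str
    server_name: str
    port: int


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read the five variables; all defaults below are assumptions."""
    app_mode = env.get("APP_MODE", "local")
    if app_mode not in {"local", "deployed"}:
        raise ValueError(f"APP_MODE must be 'local' or 'deployed', got {app_mode!r}")
    return Settings(
        ui_mode=env.get("UI_MODE", "fastapi"),
        app_mode=app_mode,
        model_id=env.get("MODEL_ID", "openai/whisper-large-v3-turbo"),
        server_name=env.get("SERVER_NAME", "localhost"),
        port=int(env.get("PORT", "7860")),  # int() raises ValueError on bad input
    )


if __name__ == "__main__":
    print(load_settings({"UI_MODE": "gradio", "PORT": "8080"}))
```

Only APP_MODE is validated against a fixed set, since the summary lists exactly two values for it; UI_MODE is left open because fastapi and gradio are given only as examples.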
Note: the documentation link in the README points to uv, not FastRTC itself. The README implies FastRTC docs exist but doesn't link them.
Highlighted Details
Supports local Whisper models (e.g., openai/whisper-large-v3-turbo) for multi-lingual transcription.
Selectable UI mode (fastapi or gradio).
Maintenance & Community
No specific details on contributors, sponsorships, or community channels (Discord/Slack) are provided in the README.
Licensing & Compatibility
The README does not explicitly state a license. Compatibility for commercial use or closed-source linking is not specified.
Limitations & Caveats
The project requires a local installation of ffmpeg and Python 3.10+. The README notes that deployed environments may need a TURN server, but the setup instructions are linked externally and must be followed separately. The FastRTC documentation link in the README appears to point to uv installation, not FastRTC itself.
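The two local requirements above can be checked before launching with a small standard-library script. This is a hedged sketch and not part of the project; the `preflight` function and its parameters are hypothetical names chosen for illustration.

```python
import shutil
import sys
from typing import List, Tuple


def preflight(min_python: Tuple[int, int] = (3, 10),
              tool: str = "ffmpeg") -> List[str]:
    """Return a list of human-readable problems; empty means all checks passed."""
    problems: List[str] = []
    if sys.version_info[:2] < min_python:
        found = ".".join(map(str, sys.version_info[:2]))
        wanted = ".".join(map(str, min_python))
        problems.append(f"Python {wanted}+ required, found {found}")
    if shutil.which(tool) is None:  # None when the executable is not on PATH
        problems.append(f"{tool} not found on PATH")
    return problems


if __name__ == "__main__":
    issues = preflight()
    for issue in issues:
        print(issue)
    sys.exit(1 if issues else 0)
```

Returning a list instead of raising on the first failure lets a user see every missing requirement in one run.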