Real-time transcription tool using faster-whisper
Top 64.9% on sourcepulse
This project provides a real-time speech-to-text transcription system leveraging the Faster-Whisper model and Silero VAD for efficient audio processing. It targets users needing accurate, low-latency transcriptions, offering an HTML-based GUI for configuration and an optional OpenAI API integration for text proofreading.
How It Works
The system captures audio via microphone using sounddevice, employing Silero VAD to segment speech and discard silence. These segments are then processed by Faster-Whisper for transcription. The architecture prioritizes speed, claiming sub-second transcription for well-separated sentences, and supports advanced Faster-Whisper features like repetition_penalty and no_repeat_ngram_size.
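The capture, segment, and transcribe flow described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: segment_speech, speech_prob, and transcribe_segment are hypothetical names, the VAD is passed in as a plain callable standing in for Silero VAD, and the threshold and penalty values are assumptions chosen for the example. The transcription helper assumes the caller supplies a faster_whisper.WhisperModel and 16 kHz mono samples.

```python
from typing import Callable, Iterable, List, Sequence

Frame = Sequence[float]  # one fixed-size chunk of mono audio samples


def segment_speech(
    frames: Iterable[Frame],
    speech_prob: Callable[[Frame], float],  # stand-in for a VAD such as Silero VAD
    threshold: float = 0.5,                 # assumed cut-off, not a project setting
    max_silence_frames: int = 3,            # silence run that closes a segment
) -> List[List[float]]:
    """Group speech frames into segments and discard the silence between them."""
    segments: List[List[float]] = []
    current: List[float] = []
    pending: List[Frame] = []  # silent frames held back until speech resumes

    for frame in frames:
        if speech_prob(frame) >= threshold:
            # A short pause inside an utterance is kept so words are not clipped.
            for silent in pending:
                current.extend(silent)
            pending = []
            current.extend(frame)
        elif current:
            pending.append(frame)
            if len(pending) >= max_silence_frames:
                # Long enough silence: close the segment, drop the trailing silence.
                segments.append(current)
                current, pending = [], []

    if current:
        segments.append(current)
    return segments


def transcribe_segment(
    model,                              # assumed to be a faster_whisper.WhisperModel
    samples: Sequence[float],           # assumed 16 kHz mono audio
    repetition_penalty: float = 1.1,    # illustrative value only
    no_repeat_ngram_size: int = 3,      # illustrative value only
) -> str:
    """Transcribe one speech segment and return its text."""
    import numpy as np  # imported lazily so the segmentation part has no dependencies

    audio = np.asarray(samples, dtype=np.float32)
    segments, _info = model.transcribe(
        audio,
        repetition_penalty=repetition_penalty,
        no_repeat_ngram_size=no_repeat_ngram_size,
    )
    return " ".join(segment.text.strip() for segment in segments)
```

In a real pipeline the frames would come from a sounddevice input stream and speech_prob would wrap the VAD model; each segment returned by segment_speech would then be handed to transcribe_segment as soon as it closes, which is what keeps latency low for well-separated sentences.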
Quick Start & Requirements
pip install .
run.bat for automated setup and execution.
Highlighted Details
Maintenance & Community
Licensing & Compatibility
Limitations & Caveats
run.bat script.
1 year ago
Inactive