Real-time audio transcription system
This project provides a real-time audio transcription system, sending microphone audio to a server for processing and displaying the transcribed text to users. It's designed for applications requiring live speech-to-text capabilities, benefiting users who need immediate textual representation of spoken words.
How It Works
The system uses a client-server architecture. A client (audio_grabber.py) captures microphone audio, segments it into chunks, and transmits these chunks to a server (transcribe_server.py). The server transcribes each chunk with the Whisper model and returns the results to a separate client (transcribe_listener.html), which displays the text in real time. This separation offloads heavy computation to the server, keeping both clients lightweight.
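The capture-and-chunk step described above can be sketched with the standard library alone. This is a minimal illustration, not the project's actual code: the 16 kHz mono 16-bit format, the two-second chunk length, and the names split_into_chunks and to_wav_bytes are all assumptions, and a buffer of silent PCM bytes stands in for the microphone capture that the project performs with PyAudio.

```python
import io
import wave

SAMPLE_RATE = 16000   # assumed; Whisper models work on 16 kHz audio
SAMPLE_WIDTH = 2      # assumed: 16-bit PCM, i.e. 2 bytes per sample
CHANNELS = 1          # assumed: mono


def split_into_chunks(pcm: bytes, seconds: float = 2.0) -> list[bytes]:
    """Split raw PCM bytes into fixed-duration chunks (the last may be shorter)."""
    frame_bytes = SAMPLE_WIDTH * CHANNELS
    chunk_bytes = int(SAMPLE_RATE * seconds) * frame_bytes
    return [pcm[i:i + chunk_bytes] for i in range(0, len(pcm), chunk_bytes)]


def to_wav_bytes(pcm: bytes) -> bytes:
    """Wrap raw PCM in a WAV container so the receiving side can decode it."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(CHANNELS)
        wav.setsampwidth(SAMPLE_WIDTH)
        wav.setframerate(SAMPLE_RATE)
        wav.writeframes(pcm)
    return buf.getvalue()


# Five seconds of silence stands in for microphone input.
silence = b"\x00" * (SAMPLE_RATE * SAMPLE_WIDTH * CHANNELS * 5)
chunks = split_into_chunks(silence)
wav_chunks = [to_wav_bytes(chunk) for chunk in chunks]
```

Shorter chunks reduce the delay before text appears but give the model less context per request, so the chunk length is a trade-off between latency and accuracy.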
Quick Start & Requirements
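Install dependencies: pip install pyaudio flask requests
Start the server: python transcribe_server.py
Start the audio client: python audio_grabber.py
View transcriptions: open transcribe_listener.html in a browser
Example inference server command (large-v3 model, port 8007): ./server -m models/ggml-large-v3.bin -l de -p 16 -t 32 --host 0.0.0.0 --port 8007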
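The ./server command resembles the HTTP server example shipped with whisper.cpp, though the README does not confirm this. Assuming that is the case, and that the server accepts multipart/form-data uploads on an /inference endpoint with a file field and a response_format field, one audio chunk could be submitted roughly as follows. The helper names build_multipart and transcribe_chunk are hypothetical, and the port is taken from the command above.

```python
import urllib.request
import uuid


def build_multipart(fields: dict[str, str], file_field: str,
                    filename: str, payload: bytes) -> tuple[bytes, str]:
    """Encode text fields plus one file as a multipart/form-data body."""
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            (f"--{boundary}\r\n"
             f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
             f"{value}\r\n").encode()
        )
    parts.append(
        (f"--{boundary}\r\n"
         f'Content-Disposition: form-data; name="{file_field}"; '
         f'filename="{filename}"\r\n'
         "Content-Type: audio/wav\r\n\r\n").encode()
        + payload + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode())
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def transcribe_chunk(wav_bytes: bytes,
                     url: str = "http://127.0.0.1:8007/inference") -> str:
    """POST one WAV chunk and return the raw response body (assumed to be JSON)."""
    body, content_type = build_multipart(
        {"response_format": "json"}, "file", "chunk.wav", wav_bytes
    )
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": content_type}, method="POST"
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return response.read().decode("utf-8")


# Build a request body without contacting any server.
body, content_type = build_multipart(
    {"response_format": "json"}, "file", "chunk.wav", b"RIFFdata"
)
```

Calling transcribe_chunk requires a running server at the given address. Only the body construction runs offline.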
Maintenance & Community
No information on contributors, community channels, or roadmap is available in the README.
Licensing & Compatibility
The README does not specify a license. Compatibility for commercial or closed-source use is undetermined.
Limitations & Caveats
The project is presented as a basic implementation, with no explicit error handling, advanced features, or performance benchmarks. The README does not state which operating systems are supported, what hardware is required beyond a standard Python environment, or which Whisper models are compatible.
10 months ago
Inactive