Real-time speech-to-text transcription app
This project provides a real-time speech-to-text transcription application designed for users needing live transcription, such as content creators, journalists, or individuals requiring accessibility tools. It leverages OpenAI's Whisper model for accurate transcription and Flet for a cross-platform GUI.
How It Works
The application utilizes Flet to build a user-friendly interface, allowing users to select audio input sources and control transcription parameters. It integrates with OpenAI's Whisper model, offering various model sizes (e.g., Tiny, Base, Small, Medium, Large) to balance performance and accuracy. The app supports translation to English and allows customization of transcription behavior through settings like transcribe_rate, seconds_of_silence_between_lines, and max_record_time.
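The three settings named above could interact roughly as in the following minimal sketch. Only the setting names come from the summary; the default values, the interpretation of each setting (all treated as durations in seconds), and the helpers should_start_new_line and is_time_to_transcribe are assumptions made for illustration, not the app's actual code.

```python
from dataclasses import dataclass


@dataclass
class TranscriberSettings:
    # Field names come from the summary; the defaults and the meaning
    # given to each field are assumptions made for this sketch.
    transcribe_rate: float = 0.5                   # seconds between transcription passes
    seconds_of_silence_between_lines: float = 0.5  # pause length that closes a line
    max_record_time: float = 30.0                  # longest a single line may record


def should_start_new_line(silence_s: float, recorded_s: float,
                          settings: TranscriberSettings) -> bool:
    """Hypothetical helper: decide whether the current line should be closed."""
    long_pause = silence_s >= settings.seconds_of_silence_between_lines
    too_long = recorded_s >= settings.max_record_time
    return long_pause or too_long


def is_time_to_transcribe(since_last_pass_s: float,
                          settings: TranscriberSettings) -> bool:
    """Hypothetical helper: True once transcribe_rate seconds have elapsed."""
    return since_last_pass_s >= settings.transcribe_rate


settings = TranscriberSettings()
print(should_start_new_line(0.1, 5.0, settings))   # short pause, short line
print(should_start_new_line(0.8, 5.0, settings))   # pause exceeds the silence threshold
print(should_start_new_line(0.1, 31.0, settings))  # line exceeds max_record_time
```

Under these assumptions, a lower transcribe_rate would refresh the on-screen text more often at the cost of more Whisper calls, while max_record_time caps how much audio a single line can accumulate when the speaker never pauses.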
Quick Start & Requirements
Install dependencies with pip install -r requirements.txt (after activating a Python 3.7 virtual environment).
cx_Freeze is used for creating executables.
Highlighted Details
Output is written to transcription.txt and settings to transcriber_settings.yaml.
Maintenance & Community
The project is maintained by davabase. No specific community channels or roadmap are detailed in the README.
Licensing & Compatibility
The code is public domain, allowing for unrestricted use, modification, and distribution, including commercial applications.
Limitations & Caveats
The setup specifies Python 3.7, which is end-of-life. Building executables uses cx_Freeze due to reported compatibility issues between PyInstaller and PyTorch. Performance is dependent on the chosen Whisper model and system resources.