Speech-to-text app using Whisper for transcription and translation
This project provides a real-time speech transcription and translation application, leveraging OpenAI's Whisper and free translation APIs. It's designed for users needing live speech-to-text, speech translation, or batch audio/video file processing, offering a user-friendly Tkinter interface.
How It Works
The application integrates OpenAI's Whisper ASR model for accurate speech-to-text and utilizes free translation APIs for language conversion. It supports live microphone input and batch processing of audio/video files, outputting transcriptions and translations in various formats (.txt, .srt, .vtt, etc.). A customizable subtitle window is available for live outputs.
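As an illustration of the subtitle export step, here is a minimal sketch of turning timed transcription segments into .srt text. It assumes segments shaped like the ones in the openai-whisper transcribe result (dicts with "start", "end", and "text" keys, times in seconds). The function names format_timestamp and segments_to_srt are hypothetical and are not taken from this project's code.

```python
from typing import Iterable, Mapping


def format_timestamp(seconds: float) -> str:
    """Convert seconds to the SRT timestamp form HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, total_ms = divmod(total_ms, 3_600_000)
    minutes, total_ms = divmod(total_ms, 60_000)
    secs, millis = divmod(total_ms, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def segments_to_srt(segments: Iterable[Mapping]) -> str:
    """Render segments as SRT cues: index, time range, then the text.

    Assumption: each segment has "start", "end" (seconds) and "text" keys,
    as in the "segments" list of an openai-whisper transcription result.
    """
    blocks = []
    for index, seg in enumerate(segments, start=1):
        start = format_timestamp(seg["start"])
        end = format_timestamp(seg["end"])
        blocks.append(f"{index}\n{start} --> {end}\n{seg['text'].strip()}\n")
    # Cues are separated by a blank line.
    return "\n".join(blocks)


if __name__ == "__main__":
    demo = [
        {"start": 0.0, "end": 1.5, "text": " Hello there."},
        {"start": 1.5, "end": 4.25, "text": " This is a test."},
    ]
    print(segments_to_srt(demo))
```

With openai-whisper installed, the input could plausibly come from whisper.load_model("base").transcribe(path, task="translate")["segments"]; how this project wires Whisper to its exporters may differ.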
Quick Start & Requirements
Install with pip:
pip install -U git+https://github.com/Dadangdut33/Speech-Translate.git --extra-index-url https://download.pytorch.org/whl/cu118 (GPU)
pip install -U git+https://github.com/Dadangdut33/Speech-Translate.git (CPU)
Then run with speech-translate.
Or install the dependencies with pip install -r requirements.txt (add --extra-index-url for GPU) and run Run.py.
Highlighted Details
Maintenance & Community
Licensing & Compatibility
Limitations & Caveats