Speech-Translate by Dadangdut33

Speech-to-text app using Whisper for transcription and translation

Created 3 years ago

624 stars

Top 53.0% on SourcePulse

Project Summary

This project provides a real-time speech transcription and translation application, leveraging OpenAI's Whisper and free translation APIs. It's designed for users needing live speech-to-text, speech translation, or batch audio/video file processing, offering a user-friendly Tkinter interface.

How It Works

The application integrates OpenAI's Whisper ASR model for accurate speech-to-text and utilizes free translation APIs for language conversion. It supports live microphone input and batch processing of audio/video files, outputting transcriptions and translations in various formats (.txt, .srt, .vtt, etc.). A customizable subtitle window is available for live outputs.
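As a rough illustration of the subtitle output step, the sketch below turns timed segments into .srt text. It assumes segments shaped like the `result["segments"]` list returned by Whisper's `transcribe` (dicts with `start`, `end`, and `text`); the helper names `srt_timestamp` and `segments_to_srt` and the sample data are hypothetical, and this is not the project's own export code.

```python
from typing import Dict, Iterable, List


def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp: HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments: Iterable[Dict]) -> str:
    """Render segments (dicts with 'start', 'end', 'text') as SRT text."""
    blocks: List[str] = []
    for index, seg in enumerate(segments, start=1):
        start = srt_timestamp(seg["start"])
        end = srt_timestamp(seg["end"])
        # Each SRT cue is: running number, time range, text, blank line.
        blocks.append(f"{index}\n{start} --> {end}\n{seg['text'].strip()}\n")
    return "\n".join(blocks)


# Hypothetical sample data, shaped like Whisper's result["segments"].
sample = [
    {"start": 0.0, "end": 2.5, "text": " Hello there."},
    {"start": 2.5, "end": 5.04, "text": " How are you?"},
]
print(segments_to_srt(sample))
```

The same segment list could feed a .vtt or .txt writer; only the timestamp format and cue layout would change.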

Quick Start & Requirements

Installation:
- Prebuilt Binary (.exe): Download from releases. Requires a CUDA 11.8-compatible GPU.
- As a Module: For GPU, pip install -U git+https://github.com/Dadangdut33/Speech-Translate.git --extra-index-url https://download.pytorch.org/whl/cu118. For CPU, pip install -U git+https://github.com/Dadangdut33/Speech-Translate.git. Then run the speech-translate command.
- From Git: Clone the repo, set up a virtual environment, run pip install -r requirements.txt (add --extra-index-url for GPU), then run Run.py.
Prerequisites: Python 3.8+ (3.11 recommended). GPU with CUDA compatibility recommended for performance. Windows 8+ for speaker input (or use loopback tools). Internet connection required for API translation and model downloads. Noto Emoji font recommended for UI.
Resources: Whisper models range from ~39M parameters (tiny) to ~1550M parameters (large), requiring roughly 1GB to 10GB of VRAM.
Docs: Wiki
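To make the VRAM guidance concrete, here is a minimal sketch that picks the largest Whisper model likely to fit a given amount of GPU memory. The per-model figures are the approximate values from the table in OpenAI's Whisper README; the `pick_model` helper is hypothetical, and actual usage in this app may differ depending on the backend and settings.

```python
# Approximate VRAM needs in GB, per the model table in OpenAI's Whisper README.
# Ordered from smallest to largest model.
WHISPER_VRAM_GB = [
    ("tiny", 1),
    ("base", 1),
    ("small", 2),
    ("medium", 5),
    ("large", 10),
]


def pick_model(available_vram_gb: float) -> str:
    """Return the largest model whose approximate VRAM need fits.

    Falls back to 'tiny' when nothing fits (hypothetical policy).
    """
    choice = "tiny"
    for name, needed_gb in WHISPER_VRAM_GB:
        if needed_gb <= available_vram_gb:
            choice = name
    return choice


for vram in (1, 4, 6, 12):
    print(f"{vram} GB -> {pick_model(vram)}")
```

Treat the result as a starting point only: leave headroom for other GPU workloads, and drop to a smaller model if live transcription lags.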

Highlighted Details

Supports live transcription and translation from microphone input.
Batch processing for audio/video files with multiple output formats.
Customizable subtitle window for real-time display.
Option to integrate local LibreTranslate for offline use.
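For the local LibreTranslate option, a request might look like the sketch below. It assumes a LibreTranslate server is already running on its default local port (5000) and uses the server's documented POST /translate endpoint; the helper names `build_translate_request` and `translate` are hypothetical, and this is not how the app itself is wired internally.

```python
import json
import urllib.request

# Assumption: a LibreTranslate server running locally on its default port.
LIBRETRANSLATE_URL = "http://localhost:5000/translate"


def build_translate_request(text, source, target, url=LIBRETRANSLATE_URL):
    """Build (but do not send) a JSON POST request for LibreTranslate."""
    payload = json.dumps(
        {"q": text, "source": source, "target": target, "format": "text"}
    ).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def translate(text, source, target, url=LIBRETRANSLATE_URL, timeout=10.0):
    """Send the request and return the translated string."""
    request = build_translate_request(text, source, target, url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.loads(response.read().decode("utf-8"))
    return body["translatedText"]
```

With a server running, `translate("Hello", "en", "id")` would return the translated text; servers configured to require an API key would also need an `api_key` field in the payload.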

Maintenance & Community

Contributions are welcomed, but the last commit was 2 years ago and the repository is currently marked inactive.
GitHub Repository

Licensing & Compatibility

MIT License. Permissive for commercial use and closed-source linking.

Limitations & Caveats

Prebuilt binaries are Windows-only and require CUDA 11.8.
Speaker input requires Windows 8+; other operating systems need an alternative audio capture method, such as a loopback tool.
Build script is currently only configured for Windows.

Health Check

Last Commit: 2 years ago
Responsiveness: Inactive
Pull Requests (30d): 0
Issues (30d): 0

Star History

2 stars in the last 30 days

