whispering-ui by Sharrnah

Native UI for live audio transcription/translation

Created 3 years ago

323 stars

Top 83.9% on SourcePulse

Project Summary

This project provides a native UI for Whispering Tiger, a tool for real-time audio transcription and translation. It targets users who want live speech-to-text and translation in applications such as streaming overlays or VRChat, and gives them a single interface for configuration and control.

How It Works

The UI acts as a control layer for the Whispering Tiger backend. It manages audio input capture (including loopback capture of system audio), AI model selection for speech-to-text and translation, and output configuration via WebSockets or OSC. Models are downloaded automatically, and users trade accuracy against performance by choosing the model size and precision level. GPU acceleration is available through CUDA on NVIDIA GPUs.
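To make the OSC output path more concrete, here is a minimal sketch of how a transcribed line could be packed into an OSC 1.0 message using only the Python standard library. The address `/chatbox/input` and UDP port 9000 are VRChat's documented OSC defaults; whether Whispering Tiger sends exactly this message is an assumption, and the helper names `osc_pad`, `osc_message`, and `send_osc` are hypothetical.

```python
import socket


def osc_pad(data: bytes) -> bytes:
    """Null-terminate and pad to a multiple of 4 bytes, as OSC 1.0 strings require."""
    padded_len = (len(data) // 4 + 1) * 4
    return data.ljust(padded_len, b"\x00")


def osc_message(address: str, text: str, flag: bool) -> bytes:
    """Build an OSC message carrying one string and one boolean argument.

    Booleans have no payload in OSC: they are encoded only as the
    type tag 'T' (true) or 'F' (false).
    """
    type_tags = ",s" + ("T" if flag else "F")
    return (
        osc_pad(address.encode("ascii"))
        + osc_pad(type_tags.encode("ascii"))
        + osc_pad(text.encode("utf-8"))
    )


def send_osc(packet: bytes, host: str = "127.0.0.1", port: int = 9000) -> None:
    """Send one OSC packet over UDP. Host and port are assumed defaults."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(packet, (host, port))
```

Calling `send_osc(osc_message("/chatbox/input", "hello", True))` would send the text "hello" to a listener on the local machine. In VRChat's chatbox API the boolean asks for the text to be sent immediately rather than opened in the keyboard.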

Quick Start & Requirements

Download the latest release from the Releases Page.
Extract to a folder on a drive with sufficient free space.
Run Whispering Tiger.exe.
Optional but recommended: Install CUDA for NVIDIA GPU acceleration.
Initial run downloads the Whispering Tiger platform and AI models.
Setup involves creating a profile, selecting audio devices, and configuring AI model parameters.

Highlighted Details

Native UI for Windows, with potential future Linux support.
Supports transcription/translation of audio streams and in-game images.
Integrated Text-to-Speech (TTS) with Silero and F5 support.
Plugin architecture for extended functionality (e.g., Realtime Subtitles, RVC).
Loopback audio capture for system audio without extra tools.
Auto-update functionality for the Whispering Tiger backend.

Maintenance & Community

The project has a Discord server for additional help.

Licensing & Compatibility

The README does not explicitly state the license for whispering-ui. The linked whispering repository is MIT licensed.

Limitations & Caveats

Currently Windows-focused, with Linux support pending.
AI model download status is not displayed during the initial setup.
Memory consumption estimates are rough and can vary.
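As a rough illustration of why such estimates vary with model size and precision, the sketch below computes a lower bound for weight memory as parameter count times bytes per parameter. It assumes OpenAI Whisper models with their published parameter counts. The precision names and the function `weight_memory_mib` are hypothetical, and the real footprint will be higher once activations, buffers, and runtime overhead are included.

```python
# Published parameter counts for OpenAI Whisper models, in millions.
WHISPER_PARAMS_MILLIONS = {
    "tiny": 39,
    "base": 74,
    "small": 244,
    "medium": 769,
    "large": 1550,
}

# Storage cost per parameter for common precision levels (assumed labels).
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "int8": 1}


def weight_memory_mib(model: str, precision: str) -> float:
    """Return the memory needed for the model weights alone, in MiB."""
    params = WHISPER_PARAMS_MILLIONS[model] * 1_000_000
    return params * BYTES_PER_PARAM[precision] / (1024 ** 2)


if __name__ == "__main__":
    for model in WHISPER_PARAMS_MILLIONS:
        row = ", ".join(
            f"{precision}: {weight_memory_mib(model, precision):.0f} MiB"
            for precision in BYTES_PER_PARAM
        )
        print(f"{model:>6}  {row}")
```

Halving the precision halves the weight memory, which is why a larger model at lower precision can fit in the same budget as a smaller model at full precision.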

Health Check

Last Commit: 2 months ago

Responsiveness: 1 day

Pull Requests (30d): 0

Issues (30d): 0

Star History

1 star in the last 30 days

Explore Similar Projects

AirTranslate by himomohi

macOS app for live system-audio transcription and translation

Created 1 month ago

Updated 1 day ago

Starred by Georgi Gerganov (Author of llama.cpp, whisper.cpp).

easy-whisper-ui by mehtabmahir

Desktop app for fast, GPU-accelerated audio/video transcription

Created 1 year ago

Updated 4 months ago

LiveTranslate by TheDeathDragon

Real-time audio translation for Windows

Created 3 months ago

Updated 2 days ago

LiveWhisper by Nikorasu

Live transcription tool using OpenAI's Whisper

Created 3 years ago

Updated 11 months ago

Starred by Jong Wook Kim (Research Scientist at OpenAI).

realtime-transcription-fastrtc by sofdog-gh

Real-time transcription tool using local Whisper models

Created 1 year ago

Updated 1 year ago

lobe-tts by lobehub

TTS/STT library for server and browser apps

Created 2 years ago

Updated 4 months ago

xtts-api-server by daswer123

FastAPI server for XTTSv2 text-to-speech

Created 2 years ago

Updated 2 years ago

Starred by Emile Vauge (Founder of Traefik).

Scriberr by rishikanthc

Self-hosted app for local AI audio transcription

Created 1 year ago

Updated 1 month ago

my-translator by phuc-nt

Real-time speech translation app for desktop

Created 4 months ago

Updated 1 day ago

stt by jianchang512

Offline speech-to-text tool for local audio/video transcription

Created 2 years ago

Updated 5 months ago

Starred by Luis Capelo (Cofounder of Lightning AI) and Matt Schrage (Cofounder of Fig).

WhisperLive by collabora

Real-time transcription app using OpenAI's Whisper

Created 3 years ago

Updated 5 days ago

Starred by Jiaming Song (Chief Scientist at Luma AI), Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), and 1 more.

RealtimeSTT by KoljaB

Speech-to-text library for realtime applications

Created 2 years ago

Updated 4 weeks ago
