whispering-ui by Sharrnah

Native UI for live audio transcription/translation

Created 2 years ago
292 stars

Top 90.3% on SourcePulse

Project Summary

This project provides a native UI for Whispering Tiger, a tool for real-time audio transcription and translation. It targets users who want live speech-to-text and translation in applications such as streaming overlays or VRChat, and it offers a user-friendly interface for configuration and control.

How It Works

The UI acts as a control layer for the Whispering Tiger backend. It manages audio input capture (including loopback audio for system sounds), AI model selection (for speech-to-text and translation), and output configuration via WebSockets or OSC. GPU acceleration is supported via CUDA on NVIDIA GPUs. Users balance accuracy against performance by choosing AI model size and precision level, and the selected models are downloaded automatically.
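
To make the OSC output path more concrete, here is a minimal sketch of how a text message could be encoded and sent as an OSC 1.0 packet over UDP. The address `/chatbox/input` and port 9000 are assumptions based on VRChat's OSC defaults; the summary does not say which addresses or ports Whispering Tiger actually uses, and the helper names below are hypothetical.

```python
import socket


def osc_pad(data: bytes) -> bytes:
    """Null-terminate and pad to a multiple of 4 bytes, as OSC 1.0 requires."""
    padded_len = (len(data) // 4 + 1) * 4
    return data.ljust(padded_len, b"\x00")


def build_osc_message(address: str, *args) -> bytes:
    """Encode an OSC message with string and boolean arguments only."""
    type_tags = ","
    payload = b""
    for arg in args:
        # bool is checked first; True/False carry no payload bytes in OSC.
        if isinstance(arg, bool):
            type_tags += "T" if arg else "F"
        elif isinstance(arg, str):
            type_tags += "s"
            payload += osc_pad(arg.encode("utf-8"))
        else:
            raise TypeError(f"unsupported OSC argument type: {type(arg).__name__}")
    return osc_pad(address.encode("ascii")) + osc_pad(type_tags.encode("ascii")) + payload


def send_osc(packet: bytes, host: str = "127.0.0.1", port: int = 9000) -> None:
    """Send one OSC packet over UDP. Host and port are assumed defaults."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(packet, (host, port))


# Hypothetical usage: a transcribed line, flagged to be shown immediately.
packet = build_osc_message("/chatbox/input", "hello world", True)
```

Calling `send_osc(packet)` would deliver the message to a local OSC listener. A real integration would take the address and port from the profile configured in the UI.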

Quick Start & Requirements

  • Download the latest release from the Releases Page.
  • Extract to a folder on a drive with sufficient free space.
  • Run Whispering Tiger.exe.
  • Optional but recommended: Install CUDA for NVIDIA GPU acceleration.
  • Initial run downloads the Whispering Tiger platform and AI models.
  • Setup involves creating a profile, selecting audio devices, and configuring AI model parameters.

Highlighted Details

  • Native UI for Windows, with potential future Linux support.
  • Supports transcription/translation of audio streams and in-game images.
  • Integrated Text-to-Speech (TTS) with Silero and F5 support.
  • Plugin architecture for extended functionality (e.g., Realtime Subtitles, RVC).
  • Loopback audio capture for system audio without extra tools.
  • Auto-update functionality for the Whispering Tiger backend.

Maintenance & Community

  • The project has a Discord server for additional help.

Licensing & Compatibility

  • The README does not explicitly state the license for whispering-ui. The linked whispering repository is MIT licensed.

Limitations & Caveats

  • Currently Windows-focused, with Linux support pending.
  • AI model download status is not displayed during the initial setup.
  • Memory consumption estimates are rough and can vary.
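
As a rough illustration of why such estimates vary with model size and precision, the sketch below computes a weights-only memory figure as parameter count times bytes per parameter. It assumes Whisper-family speech-to-text models with the parameter counts published for OpenAI Whisper; the summary does not name the models or describe how the UI computes its own estimate, and the function name is hypothetical. Real usage will be higher because activations, buffers, and any translation or TTS models are not counted.

```python
# Approximate parameter counts published for OpenAI Whisper model sizes.
# Assumption: the backend's speech-to-text models are of this family.
PARAMS = {
    "tiny": 39e6,
    "base": 74e6,
    "small": 244e6,
    "medium": 769e6,
    "large": 1550e6,
}

# Bytes needed to store one weight at each precision level.
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "int8": 1}


def estimate_weight_memory_gib(size: str, precision: str) -> float:
    """Weights-only lower bound in GiB; runtime overhead is not included."""
    return PARAMS[size] * BYTES_PER_PARAM[precision] / 2**30


for size in PARAMS:
    row = ", ".join(
        f"{prec}: {estimate_weight_memory_gib(size, prec):.2f} GiB"
        for prec in BYTES_PER_PARAM
    )
    print(f"{size:>6}  {row}")
```

Halving the precision halves the weight footprint, which is the trade-off the model size and precision settings expose.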

Health Check

  • Last commit: 17 hours ago
  • Responsiveness: 1 day
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 6 stars in the last 30 days
