easy-whisper-ui  by mehtabmahir

Desktop app for fast, GPU-accelerated audio/video transcription

Created 9 months ago
429 stars

Top 69.1% on SourcePulse

Project Summary

EasyWhisperUI is a user-friendly, cross-platform desktop interface for local audio and video transcription with the Whisper model, accelerated on the GPU. It targets users who want efficient, private transcription on their own hardware, and offers the same experience on Windows and macOS, including batch processing and multi-language support.

How It Works

The project is an Electron application: React renders the UI, and inter-process communication (IPC) provides a secure channel between the renderer and main processes. Transcription is performed by whisper.cpp, with GPU acceleration via Vulkan on Windows and Metal on macOS. FFmpeg handles media file conversion. This design gives a hardened, isolated UI environment and predictable management of Whisper binaries and models.
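
The FFmpeg conversion step can be illustrated with a minimal sketch. The summary does not state which settings the app uses; the arguments below follow the conversion that the whisper.cpp README documents for its example CLI (16 kHz, mono, 16-bit PCM WAV). The helper name `buildFfmpegArgs` and the file names are hypothetical.

```typescript
// Hypothetical helper, not taken from the project's source.
// Builds FFmpeg arguments that convert a media file to 16 kHz, mono,
// 16-bit PCM WAV, the input format documented in the whisper.cpp README.
function buildFfmpegArgs(inputPath: string, outputPath: string): string[] {
  return [
    "-y",                 // overwrite the output file if it already exists
    "-i", inputPath,      // source audio or video file
    "-ar", "16000",       // resample to 16 kHz
    "-ac", "1",           // downmix to a single (mono) channel
    "-c:a", "pcm_s16le",  // encode as signed 16-bit little-endian PCM
    outputPath,
  ];
}

// Example: print the full command line for an assumed input file.
const exampleArgs = buildFfmpegArgs("interview.mp4", "interview.wav");
console.log(["ffmpeg"].concat(exampleArgs).join(" "));
```

Returning an argument array rather than a single command string means the values can be passed to a process-spawning API without shell quoting, which matters for file paths that contain spaces.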

Quick Start & Requirements

  • Install: Download the Windows installer or macOS .dmg from the project's Releases page. Installation is per-user.
  • Requirements:
    • Windows 10/11 with an AMD, Intel, or NVIDIA GPU supporting Vulkan.
    • macOS with Apple Silicon (M1/M2/M3/M4/M5).
    • Virtual machines require Vulkan support (e.g., GPU passthrough).
    • Linux is not currently supported.
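
The platform requirements above amount to a small decision table, sketched here as a function. The name `selectBackend` and its return type are hypothetical; the platform and architecture strings are the values Node.js reports in `process.platform` and `process.arch`. The sketch only maps platforms to backends and does not check whether a Vulkan-capable GPU or driver is actually present.

```typescript
// Hypothetical sketch of the requirements list, not the project's own code.
type GpuBackend = "vulkan" | "metal";

// Returns the GPU backend for a platform, or null when unsupported.
function selectBackend(platform: string, arch: string): GpuBackend | null {
  if (platform === "win32") {
    return "vulkan"; // Windows 10/11: Vulkan-capable AMD, Intel, or NVIDIA GPU
  }
  if (platform === "darwin" && arch === "arm64") {
    return "metal"; // macOS on Apple Silicon
  }
  return null; // Linux and Intel Macs are not covered by the listed requirements
}

console.log(selectBackend("win32", "x64"));
console.log(selectBackend("darwin", "arm64"));
console.log(selectBackend("linux", "x64"));
```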

Highlighted Details

  • Cross-platform desktop application (Windows, macOS; Linux planned).
  • GPU acceleration via Vulkan (Windows) and Metal (macOS).
  • Features include live transcription (beta), batch processing queue, translation for 100+ languages, .txt/.srt output formats, drag & drop support, and automatic media conversion using FFmpeg.
  • Supports model and language selection, with automatic model downloads and a console output view during processing.
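
To make the .srt output format concrete, here is a minimal sketch of how timed transcript segments map to SubRip text. The `Segment` shape and the function names are assumptions for illustration; they are not the app's internal types. The layout itself (a running index, a `HH:MM:SS,mmm --> HH:MM:SS,mmm` time line, the text, then a blank line) is the standard SubRip convention.

```typescript
// Hypothetical segment shape; the app's real data model is not documented here.
interface Segment {
  startMs: number; // segment start, in milliseconds
  endMs: number;   // segment end, in milliseconds
  text: string;    // transcribed text for this time span
}

// Left-pads a non-negative integer with zeros to the given width.
function zeroPad(value: number, width: number): string {
  let s = String(value);
  while (s.length < width) {
    s = "0" + s;
  }
  return s;
}

// Formats milliseconds as an SRT timestamp: HH:MM:SS,mmm
function formatSrtTimestamp(totalMs: number): string {
  const ms = totalMs % 1000;
  const totalSeconds = Math.floor(totalMs / 1000);
  const seconds = totalSeconds % 60;
  const minutes = Math.floor(totalSeconds / 60) % 60;
  const hours = Math.floor(totalSeconds / 3600);
  return zeroPad(hours, 2) + ":" + zeroPad(minutes, 2) + ":" +
    zeroPad(seconds, 2) + "," + zeroPad(ms, 3);
}

// Renders segments as SRT blocks separated by a blank line.
function toSrt(segments: Segment[]): string {
  return segments
    .map((seg, i) =>
      String(i + 1) + "\n" +
      formatSrtTimestamp(seg.startMs) + " --> " + formatSrtTimestamp(seg.endMs) + "\n" +
      seg.text + "\n")
    .join("\n");
}

console.log(toSrt([
  { startMs: 0, endMs: 1500, text: "Hello" },
  { startMs: 1500, endMs: 4000, text: "and welcome." },
]));
```

A .txt export would be the same segments with only the `text` fields joined, which is why the two formats can share one segment list.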

Maintenance & Community

The project relies on community donations for maintenance. It credits whisper.cpp by Georgi Gerganov and FFmpeg. No community channels (such as Discord or Slack) and no roadmap are listed.

Licensing & Compatibility

The application is proprietary ("All rights reserved"). Commercial use is prohibited without the author's permission, as are copying, modification, and distribution beyond personal use. It incorporates whisper.cpp (MIT License) and FFmpeg (LGPL 2.1). Commercial use or linking within closed-source projects is restricted.

Limitations & Caveats

Linux support is explicitly stated as not yet implemented. The live transcription feature is marked as beta. The proprietary license severely restricts commercial application and redistribution.

Health Check

  • Last commit: 1 day ago
  • Responsiveness: Inactive
  • Pull requests (30d): 3
  • Issues (30d): 7

Star History

33 stars in the last 30 days
