Whisper-Input by ErlichLiu

CLI tool for voice transcription using Groq or SiliconFlow

created 6 months ago
545 stars

Top 59.3% on sourcepulse

Project Summary

Whisper-Input is a push-to-talk speech-to-text utility, developed primarily for macOS, that sends recorded audio to hosted speech models for fast transcription. It is designed for users who rely heavily on voice input, including those with visual impairments: hold a key, speak, and release to dictate text.

How It Works

The tool starts capturing audio when a designated key (Option/Alt) is pressed and stops when the key is released. It then sends the recording to either Groq's Whisper Large V3 Turbo model or SiliconFlow's SenseVoiceSmall model for transcription. The choice lets users prioritize speed (Groq) or accuracy and punctuation (SiliconFlow); both services offer free usage tiers.
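The hold-to-record flow described above can be sketched as a small state machine. This is a minimal illustration, not the project's actual code: the class name `PushToTalkSession`, its method names, and the `transcribe` callback are all hypothetical, and real key listening, microphone capture, and the network call to Groq or SiliconFlow are deliberately left out.

```python
from typing import Callable, List, Optional


class PushToTalkSession:
    """Hold-to-record state machine (illustrative; all names are hypothetical)."""

    def __init__(self, transcribe: Callable[[bytes], str]) -> None:
        # `transcribe` stands in for whichever backend call is configured.
        self._transcribe = transcribe
        self._chunks: List[bytes] = []
        self._recording = False

    def on_key_press(self) -> None:
        # Begin a fresh recording; repeated press events while held are ignored.
        if not self._recording:
            self._chunks.clear()
            self._recording = True

    def on_audio_chunk(self, chunk: bytes) -> None:
        # Buffer audio only while the key is held down.
        if self._recording:
            self._chunks.append(chunk)

    def on_key_release(self) -> Optional[str]:
        # Stop recording and hand the buffered audio to the backend.
        if not self._recording:
            return None
        self._recording = False
        audio = b"".join(self._chunks)
        if not audio:
            return None  # nothing was captured, so skip the backend call
        return self._transcribe(audio)
```

In a real tool, the three `on_*` methods would be wired to a keyboard listener and an audio input stream. Passing the backend in as a callback keeps the press/release logic independent of which provider is used.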

Quick Start & Requirements

  • Install: Clone the repository and set up a Python virtual environment.
  • Prerequisites: Python 3.10+ (3.12.5 recommended; 3.13.1 has known issues).
  • Configuration:
  • Dependencies: run pip install pip-tools, then pip-compile requirements.in, then pip install -r requirements.txt.
  • Run: python main.py.
  • Docs: https://erlich.fun
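Because the prerequisites single out specific interpreter versions, a small pre-flight check can catch a mismatched environment before installing dependencies. The sketch below encodes only the constraints stated above (3.10 minimum, known issues on 3.13.1). The function name `check_python` and its return labels are hypothetical and are not part of the project.

```python
import sys
from typing import Tuple

MIN_VERSION = (3, 10)                # minimum stated in the prerequisites
KNOWN_ISSUE_VERSIONS = {(3, 13, 1)}  # version reported to have known issues


def check_python(version: Tuple[int, int, int]) -> str:
    """Classify an interpreter version as 'unsupported', 'known-issues', or 'ok'."""
    if version[:2] < MIN_VERSION:
        return "unsupported"
    if version in KNOWN_ISSUE_VERSIONS:
        return "known-issues"
    return "ok"


if __name__ == "__main__":
    # Report the status of the interpreter running this script.
    print(check_python(tuple(sys.version_info[:3])))
```

Run inside the virtual environment, the script prints the status for the active interpreter. Anything other than "ok" suggests switching to a supported release such as the recommended 3.12.5.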

Highlighted Details

  • Supports real-time transcription with feedback in 1-2 seconds via Groq.
  • Offers SenseVoiceSmall for potentially faster and more accurate results with built-in punctuation.
  • Includes a feature to translate transcribed text from Chinese to English.
  • A macOS client with a focus on accessibility is under active development.

Maintenance & Community

The project is actively maintained. Updates in January 2025 added support for SiliconFlow, Windows compatibility, and various input/output options. The author welcomes contributions via fork and pull request, and problems can be reported as issues. Anyone interested in Windows client development can contact the author via WeChat.

Licensing & Compatibility

The project is released under the MIT license, permitting commercial use and integration with closed-source projects.

Limitations & Caveats

The project is primarily focused on macOS, with Windows support added recently. A known issue exists with Python 3.13.1 regarding cursor switching. The project acknowledges the existence of a more feature-rich alternative, WhisperKeyBoard.

Health Check

  • Last commit: 6 months ago
  • Responsiveness: 1 week
  • Pull Requests (30d): 0
  • Issues (30d): 0

Star History

33 stars in the last 90 days
