AudioWhisper by mazdak

Lightweight macOS audio transcription app

Created 9 months ago
257 stars

Top 98.3% on SourcePulse

Project Summary

AudioWhisper is a lightweight macOS menu bar app for quick audio-to-text transcription. It supports the cloud services OpenAI Whisper and Google Gemini as well as the local engines WhisperKit and Parakeet-MLX. It is aimed at macOS users who want fast, convenient transcription, with hotkeys, automatic clipboard copying, and optional semantic cleanup.

How It Works

A global hotkey triggers audio recording and transcription. Transcription can run in the cloud (OpenAI, Gemini) or locally (WhisperKit, or Parakeet-MLX on Apple Silicon). The transcript can optionally go through semantic cleanup, using either local MLX or a cloud provider, with customizable categories that improve accuracy in specific contexts.
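
The choice between cloud and local engines can be pictured with a small sketch. This is not AudioWhisper's code: the `Settings` type, the `choose_engine` function, the engine name strings, and the policy of falling back to WhisperKit are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional

# Engine names taken from the summary; the string spellings are assumptions.
CLOUD_ENGINES = {"openai", "gemini"}
LOCAL_ENGINES = {"whisperkit", "parakeet-mlx"}


@dataclass
class Settings:
    engine: str                    # preferred engine name
    api_key: Optional[str] = None  # only cloud engines need a key
    apple_silicon: bool = True     # Parakeet-MLX is described as Apple Silicon only


def choose_engine(s: Settings) -> str:
    """Pick an engine, falling back to local WhisperKit (a hypothetical policy)."""
    if s.engine in CLOUD_ENGINES:
        # Without an API key a cloud engine cannot be used.
        return s.engine if s.api_key else "whisperkit"
    if s.engine == "parakeet-mlx":
        return s.engine if s.apple_silicon else "whisperkit"
    if s.engine in LOCAL_ENGINES:
        return s.engine
    raise ValueError(f"unknown engine: {s.engine}")
```

For example, `choose_engine(Settings("gemini"))` returns `"whisperkit"` under this policy because no API key is set.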

Quick Start & Requirements

Installation is recommended via Homebrew (brew tap mazdak/tap && brew install audiowhisper). Pre-built apps are also available. The application requires macOS 14.0 (Sonoma) or later; Apple Silicon is strongly recommended for optimal performance and local MLX/Parakeet-MLX features. Disk space up to ~2.5 GB may be needed for local models. Cloud transcription requires OpenAI or Google Gemini API keys.
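
Before installing, the two stated requirements (macOS 14.0 or later, and Apple Silicon for the local MLX features) can be checked with a few lines of Python. This is a rough sketch using only the standard library; the `meets_requirements` helper is a hypothetical name and is not part of the project.

```python
import platform


def meets_requirements(mac_version: str, machine: str) -> tuple[bool, bool]:
    """Return (os_ok, apple_silicon) for a macOS version string and CPU architecture."""
    parts = mac_version.split(".") if mac_version else []
    major = int(parts[0]) if parts and parts[0].isdigit() else 0
    os_ok = major >= 14                  # macOS 14.0 (Sonoma) or later
    apple_silicon = machine == "arm64"   # may read x86_64 if Python runs under Rosetta
    return os_ok, apple_silicon


if __name__ == "__main__":
    # platform.mac_ver()[0] is an empty string on non-macOS systems.
    print(meets_requirements(platform.mac_ver()[0], platform.machine()))
```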

Highlighted Details

  • Multiple Transcription Engines: Cloud (OpenAI, Gemini) and local (WhisperKit, Parakeet-MLX).
  • Semantic Cleanup: Optional post-processing with local MLX (Apple Silicon) or cloud, featuring app-aware categories.
  • File Transcription: Transcribe existing audio files directly from the menu bar.
  • History & Usage Dashboard: Opt-in local transcript storage, search, retention, and productivity insights.
  • Smart Paste & Focus: Auto-pastes text and restores focus to the originating app.
  • Secure by Default: API keys in Keychain; local modes keep audio on-device.
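
The history feature's retention and search behavior can be illustrated with a short sketch. How AudioWhisper actually stores transcripts is not described here, so the in-memory `Transcript` record and the `prune` and `search` helpers below are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Transcript:
    text: str
    created: datetime


def prune(history: list[Transcript], retention_days: int, now: datetime) -> list[Transcript]:
    """Keep only transcripts created within the retention window."""
    cutoff = now - timedelta(days=retention_days)
    return [t for t in history if t.created >= cutoff]


def search(history: list[Transcript], query: str) -> list[Transcript]:
    """Case-insensitive substring search over stored transcripts."""
    q = query.lower()
    return [t for t in history if q in t.text.lower()]
```

A real implementation would persist records on disk and run pruning on a schedule; the sketch only shows the filtering logic.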

Maintenance & Community

The project appears actively maintained. Specific details on core contributors, sponsorships, or community channels (e.g., Discord, Slack) are not explicitly provided in the README.

Licensing & Compatibility

Distributed under the permissive MIT License, allowing for commercial use. Requires macOS 14.0+; Apple Silicon is recommended for advanced local features.

Limitations & Caveats

Requires macOS 14.0 (Sonoma) or later. Apple Silicon is highly recommended for local MLX/Parakeet-MLX features. Smart Paste and Press & Hold need explicit user permissions (Input Monitoring, Accessibility). Cloud transcription requires API keys and may incur costs.

Health Check

  • Last Commit: 1 month ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 2
  • Star History: 9 stars in the last 30 days
