openwhispr by OpenWhispr

AI-powered dictation app with local and cloud processing

Created 8 months ago

1,395 stars

Top 28.6% on SourcePulse

Project Summary

Summary

OpenWhispr is an open-source desktop dictation application for flexible, private speech-to-text conversion. It targets users who want a privacy-first, cross-platform tool that integrates with AI models and goes beyond simple transcription. Its primary benefit is a customizable dictation tool that processes audio locally or in the cloud, pastes the transcribed text automatically, and responds to AI commands.

How It Works

This cross-platform application is built with Electron, React 19, TypeScript, and Tailwind CSS v4, and uses Vite for optimized builds. It offers two processing modes: local transcription with OpenAI Whisper models (managed through Python) for maximum privacy, or cloud-based processing through APIs from OpenAI, Anthropic, and Google. Users trigger dictation with a global hotkey, and the transcribed text is pasted automatically at the cursor. An "Agent Naming" feature lets users address natural-language commands to AI models.
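The choice between the two processing modes can be pictured as a small routing decision. The following is a minimal sketch under stated assumptions, not code from the OpenWhispr repository: the Settings class, the choose_backend function, and the environment variable names are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass
from typing import Mapping

# Hypothetical mapping from cloud provider to the environment variable
# that holds its API key. The names are assumptions, not confirmed
# OpenWhispr configuration keys.
PROVIDER_KEY_VARS = {
    "openai": "OPENAI_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
    "google": "GEMINI_API_KEY",
}


@dataclass
class Settings:
    """Hypothetical user settings for one dictation session."""
    prefer_local: bool = True
    provider: str = "openai"


def choose_backend(settings: Settings, env: Mapping[str, str]) -> str:
    """Return "local" or "cloud:<provider>".

    Local processing is chosen when the user prefers it, or when no API
    key is available for the selected cloud provider.
    """
    if settings.prefer_local:
        return "local"
    key_var = PROVIDER_KEY_VARS.get(settings.provider)
    if key_var and env.get(key_var):
        return f"cloud:{settings.provider}"
    return "local"
```

For example, choose_backend(Settings(prefer_local=False), os.environ) would select the cloud only when a key for the chosen provider is present. Falling back to local processing keeps dictation usable without any API key, which fits the privacy-first framing above.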

Quick Start & Requirements

To install, clone the repository (git clone https://github.com/HeroTools/open-whispr.git) and run npm install. Node.js 18+ and npm are required. Python 3.7+ is optional: it is used only for local Whisper processing and is installed automatically. Supported platforms are macOS 10.15+, Windows 10+, and Linux; on macOS, Xcode Command Line Tools are needed for Globe key support. API keys for cloud providers can be configured in an .env file or in the in-app Control Panel. Start development mode with npm run dev and production mode with npm start. To build an unsigned app for personal use, run npm run pack.
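Before running npm install, it can help to confirm that the version requirements listed above are met. The sketch below is one possible check, assuming node is on the PATH; the helper names (parse_version, meets_minimum, check_prerequisites) are hypothetical and not part of the project.

```python
import re
import shutil
import subprocess
import sys
from typing import Dict, Optional, Tuple

Version = Tuple[int, ...]


def parse_version(text: str) -> Optional[Version]:
    """Extract the first dotted version number, e.g. 'v18.17.0' -> (18, 17, 0)."""
    match = re.search(r"(\d+(?:\.\d+)*)", text)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))


def meets_minimum(found: Optional[Version], minimum: Version) -> bool:
    """True when a version was found and is at least the minimum."""
    return found is not None and found >= minimum


def node_version() -> Optional[Version]:
    """Ask the installed Node.js for its version; None if it is unavailable."""
    if shutil.which("node") is None:
        return None
    try:
        result = subprocess.run(
            ["node", "--version"], capture_output=True, text=True, timeout=10
        )
    except (OSError, subprocess.SubprocessError):
        return None
    return parse_version(result.stdout)


def check_prerequisites() -> Dict[str, bool]:
    """Compare local tool versions with the minimums from the Quick Start."""
    return {
        "node>=18": meets_minimum(node_version(), (18,)),
        "python>=3.7": meets_minimum(tuple(sys.version_info[:3]), (3, 7)),
    }
```

Calling check_prerequisites() returns a dictionary of pass/fail flags. Since Python is only needed for local Whisper processing, a failed Python check matters only for that mode.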

Highlighted Details

Flexible AI Integration: Supports OpenAI (GPT-5/4.1/o-series), Anthropic (Claude 4.1/4/3.5), Google Gemini (2.5 Pro/Flash), and local models (Qwen, LLaMA, Mistral) via llama.cpp.
Privacy-Focused: Offers local processing with downloadable Whisper models (tiny to large) to keep voice data private.
Modern Tech Stack: Built with React 19, TypeScript, Tailwind CSS v4, and Vite, providing a fast, modern UI.
User Experience: Features a global hotkey, automatic text pasting, a draggable interface, and agent naming for conversational AI commands.
Model Management: Includes tools for downloading, managing, and cleaning up local Whisper models.
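To illustrate how agent naming could separate AI commands from ordinary dictation, here is a minimal sketch. It assumes a command is any transcript that opens with the agent's name, optionally preceded by "hey"; that convention, the function name extract_agent_command, and the example name "Jarvis" are assumptions for illustration and may differ from what OpenWhispr actually does.

```python
import re
from typing import Optional


def extract_agent_command(transcript: str, agent_name: str) -> Optional[str]:
    """Return the instruction addressed to the agent, or None.

    A transcript counts as a command when it opens with the agent's name,
    optionally preceded by "hey", e.g. "Hey Jarvis, make this more formal".
    Anything else is treated as plain dictation.
    """
    pattern = rf"^\s*(?:hey\s+)?{re.escape(agent_name)}\b[\s,.:!]*(.*)$"
    match = re.match(pattern, transcript, flags=re.IGNORECASE | re.DOTALL)
    if match is None:
        return None
    command = match.group(1).strip()
    # A bare name with nothing after it carries no instruction.
    return command or None


if __name__ == "__main__":
    print(extract_agent_command("Hey Jarvis, make this more formal", "Jarvis"))
    print(extract_agent_command("Remember to call Jarvis tomorrow", "Jarvis"))
```

Anchoring the match at the start of the transcript means a name that appears mid-sentence stays in the dictated text instead of triggering a command.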

Maintenance & Community

The project is actively maintained and marked as ready for production use (current version 1.0.4). The README does not mention community channels (such as Discord or Slack) or notable contributors.

Licensing & Compatibility

Licensed under the MIT License, which permits free use, modification, and distribution for both personal and commercial purposes, provided the copyright and license notice are retained.

Limitations & Caveats

Building signed, distributable applications requires code signing certificates and possibly developer accounts. Unsigned builds on macOS may trigger security warnings. Automatic text pasting requires granting accessibility permissions. The README refers to AI model releases dated September 2025, which should be read as forward-looking claims. Local processing requires enough disk space for the downloaded Whisper models.
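The disk-space caveat can be estimated roughly before downloading a model. The sketch below uses the parameter counts published for the original Whisper model sizes and assumes about two bytes per parameter (16-bit weights); actual download sizes depend on the model format and may differ. The function names and the safety factor are hypothetical.

```python
import shutil

# Approximate parameter counts (in millions) published for the original
# Whisper model sizes.
WHISPER_PARAMS_MILLIONS = {
    "tiny": 39,
    "base": 74,
    "small": 244,
    "medium": 769,
    "large": 1550,
}

# Assumption: weights are stored as 16-bit floats. Other formats
# (for example quantized files) will be smaller or larger.
BYTES_PER_PARAM = 2


def estimated_model_bytes(model: str) -> int:
    """Rough on-disk size of a Whisper model, in bytes."""
    return WHISPER_PARAMS_MILLIONS[model] * 1_000_000 * BYTES_PER_PARAM


def has_room_for(model: str, path: str = ".", safety_factor: float = 1.5) -> bool:
    """Check that the volume holding `path` has room for the model.

    The safety factor leaves headroom for temporary files during download.
    """
    free_bytes = shutil.disk_usage(path).free
    return free_bytes >= estimated_model_bytes(model) * safety_factor
```

Under these assumptions the tiny model needs well under 100 MB, while the large model needs roughly 3 GB, so a check like has_room_for("large") is most useful for the bigger models.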

Health Check

Last Commit

1 day ago

Responsiveness

Inactive

Star History

534 stars in the last 30 days