VoiceFlow by infiniV

Private, local speech-to-text for Windows

Created 1 month ago
256 stars

Top 98.7% on SourcePulse

Project Summary

VoiceFlow addresses the privacy and cost concerns of cloud-based dictation services by offering a fully local, free, and open-source alternative powered by OpenAI's Whisper. It targets privacy-conscious professionals and other users who need reliable, low-latency transcription without their voice data leaving their Windows machine. The primary benefits are complete data ownership, offline capability, and zero recurring costs.

How It Works

VoiceFlow captures and transcribes all audio on the user's local hardware, running OpenAI's Whisper models through the faster-whisper (CTranslate2) inference engine. Because voice data stays in RAM and is never transmitted to external servers, audio remains private and transcription does not depend on network round trips. The application runs unobtrusively in the background with a minimal system tray presence and automatically pastes transcribed text at the user's cursor.
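The local transcription step described above can be sketched with the public faster-whisper Python API. This is a minimal illustration, not VoiceFlow's actual code: the function names `join_segments` and `transcribe_locally`, the "tiny" model size, and the CPU/int8 settings are all assumptions chosen for the example.

```python
def join_segments(segments):
    """Concatenate the text of transcribed segments into one string.

    Each segment is expected to expose a `.text` attribute, as the
    segments yielded by faster-whisper do.
    """
    parts = (seg.text.strip() for seg in segments)
    return " ".join(part for part in parts if part)


def transcribe_locally(audio_path, model_size="tiny"):
    """Transcribe an audio file on this machine with faster-whisper.

    Hypothetical helper: requires `pip install faster-whisper`. Model
    weights are fetched on first use and cached locally afterwards.
    """
    # Imported lazily so join_segments stays usable without the library.
    from faster_whisper import WhisperModel

    # device and compute_type are illustrative choices for a CPU-only
    # machine, not settings confirmed by the VoiceFlow project.
    model = WhisperModel(model_size, device="cpu", compute_type="int8")

    # transcribe() returns a lazy generator of segments plus metadata
    # that includes the detected language.
    segments, info = model.transcribe(audio_path)
    return info.language, join_segments(segments)
```

Calling `transcribe_locally("note.wav")` with a hypothetical recording would return the detected language code and the joined transcript. Inference runs in-process, so the audio is never sent over the network.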

Quick Start & Requirements

  • Installer: Download Installer v1.3.1 (Windows, 64-bit, ~150MB).
  • Prerequisites: Windows 10/11. An initial download of the AI model is required. Developers also need pnpm for setup and development.
  • Developer Setup: Clone the repository (https://github.com/infiniV/VoiceFlow.git), then run pnpm run setup. Use pnpm run dev for development or pnpm run build:installer to build the installer.

Highlighted Details

  • Privacy: 100% local processing, zero data leaks, no telemetry or analytics.
  • Offline Capability: Fully functional without an internet connection after initial model download.
  • Cost: Free ($0.00) with no subscription fees.
  • Performance: Real-time transcription with no network latency, since all inference runs locally.
  • Models: Supports 16+ Whisper models (e.g., Tiny to Large-v3, Turbo, English-only, Distilled) ranging from 75MB to 3GB, allowing users to balance speed and accuracy.
  • Features: Automatic detection for 99+ languages, customizable hotkeys (Hold/Toggle modes), auto-paste functionality, and a searchable local history database (SQLite).
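A searchable local history like the one listed above could be built on Python's standard sqlite3 module along the following lines. The table name, columns, and function names are hypothetical, since the source does not describe VoiceFlow's actual schema. The sketch uses a plain substring search.

```python
import sqlite3


def open_history(path=":memory:"):
    """Open (or create) a history database; ':memory:' keeps it in RAM."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS history ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " created_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,"
        " text TEXT NOT NULL)"
    )
    return conn


def add_entry(conn, text):
    """Store one transcription and return its row id."""
    with conn:  # commits on success, rolls back on error
        cur = conn.execute("INSERT INTO history (text) VALUES (?)", (text,))
    return cur.lastrowid


def search(conn, query):
    """Return (id, text) rows containing the query, newest first."""
    # Escape LIKE wildcards so user input is matched literally.
    escaped = (
        query.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
    )
    rows = conn.execute(
        "SELECT id, text FROM history"
        " WHERE text LIKE ? ESCAPE '\\' ORDER BY id DESC",
        (f"%{escaped}%",),
    )
    return rows.fetchall()
```

SQLite's LIKE is case-insensitive for ASCII characters by default, so a search for "meeting" also matches "Meeting". For large histories, a full-text index such as SQLite's FTS5 extension would be a natural alternative to the substring scan.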

Maintenance & Community

The project provides links to its Releases and Issues pages on GitHub, facilitating bug reporting and tracking.

Licensing & Compatibility

VoiceFlow is released under the MIT License, permitting commercial use and modification. It is compatible with Windows 10 and 11 (64-bit).

Limitations & Caveats

The application is currently Windows-only. The AI models must be downloaded once before the application can be used offline. Developer setup relies on pnpm.

Health Check

  • Last Commit: 1 week ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 4
  • Star History: 163 stars in the last 30 days
