input0 by 10xChengTu

macOS voice input tool for local transcription and LLM-powered text refinement

Created 1 month ago
252 stars

Top 99.6% on SourcePulse

Project Summary

Input 0 is a macOS voice input tool that streamlines dictation: hold a hotkey to record, release to transcribe locally with speech-to-text (STT), optionally refine the result with an LLM, and auto-paste it into the active text field. It targets macOS users who want private, fast dictation without switching windows or manually editing text.

How It Works

Input 0 uses a "Press & Speak" flow: holding a customizable hotkey starts audio recording, and releasing it triggers transcription. The audio is processed locally by one of several speech-to-text (STT) engines, such as Whisper or SenseVoice. Running on-device means audio never leaves the user's machine, and Metal GPU acceleration keeps transcription fast. After transcription, an optional large language model (LLM), configured with a user-provided API key, refines the text by correcting grammar, removing filler words, restructuring sentences, and applying custom vocabulary. The final text is then pasted automatically into the active input field of whichever application has focus.
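
The flow described above can be sketched as a small pipeline. This is an illustrative outline only, not the project's actual code: `DictationPipeline`, `on_hotkey_release`, and the three callables are hypothetical names, and the STT engine, LLM call, and paste mechanism are passed in as plain functions so the control flow is visible.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class DictationPipeline:
    """Hypothetical sketch of the record -> transcribe -> refine -> paste flow."""

    transcribe: Callable[[bytes], str]             # stand-in for a local STT engine
    paste: Callable[[str], None]                   # stand-in for the auto-paste step
    refine: Optional[Callable[[str], str]] = None  # stand-in for the LLM; None if no API key

    def on_hotkey_release(self, audio: bytes) -> str:
        # 1. Transcribe the recorded audio locally.
        raw_text = self.transcribe(audio)
        # 2. Refine with the LLM only when one is configured; otherwise skip.
        final_text = self.refine(raw_text) if self.refine is not None else raw_text
        # 3. Hand the result to the paste step.
        self.paste(final_text)
        return final_text
```

Leaving `refine` unset models the case where no LLM API key is configured: the raw transcript is pasted unchanged.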

Quick Start & Requirements

To install, download the latest .dmg file from GitHub Releases and drag the Input 0 application into the Applications folder. Grant Microphone and Accessibility permissions in macOS System Settings, then restart the app. Before first use, download at least one STT model from within the application; to enable AI optimization, also set an LLM API key. The system requires macOS 11.0 or later, and an Apple Silicon processor is strongly recommended for Metal GPU acceleration.
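
As a rough way to check a machine against the stated requirements, the sketch below uses Python's standard `platform` module. The function names are hypothetical, and the check assumes that `platform.machine()` reports `arm64` on Apple Silicon, which may not hold when Python itself runs under Rosetta.

```python
import platform
from typing import Tuple


def meets_requirements(system: str, mac_version: str, machine: str) -> Tuple[bool, bool]:
    """Return (supported, apple_silicon) for the given platform details.

    supported: True when the OS is macOS 11.0 or later.
    apple_silicon: True when the CPU architecture is arm64.
    """
    if system != "Darwin" or not mac_version:
        return (False, False)
    major = int(mac_version.split(".")[0])
    return (major >= 11, machine == "arm64")


def check_this_machine() -> Tuple[bool, bool]:
    # platform.mac_ver() returns (release, versioninfo, machine);
    # release is an empty string on non-macOS systems.
    release = platform.mac_ver()[0]
    return meets_requirements(platform.system(), release, platform.machine())
```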

Highlighted Details

  • Privacy-First Local STT: Employs multiple on-device AI engines (Whisper, SenseVoice, Paraformer, etc.) running via Metal GPU, ensuring all audio processing remains local.
  • AI-Powered Text Optimization: Leverages LLMs to automatically correct grammar, improve structure, handle technical terms, and incorporate custom vocabulary.
  • Seamless Auto-Paste: Intelligently pastes the polished text directly into any active application's input field without manual intervention.
  • Extensive Language Support: Offers transcription capabilities for over 99 languages through a selection of diverse STT models.
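
To make the text-optimization step more concrete, here is one way the refinement instructions and custom vocabulary could be combined into a single LLM prompt. This is an assumption for illustration: the README summary does not describe the project's real prompt, and `build_refinement_prompt` is a hypothetical helper.

```python
from typing import Iterable


def build_refinement_prompt(transcript: str, vocabulary: Iterable[str] = ()) -> str:
    """Build a hypothetical refinement prompt from a transcript and custom terms."""
    rules = [
        "Correct grammar and punctuation.",
        "Remove filler words.",
        "Keep the speaker's meaning; do not add new content.",
    ]
    # Deduplicate and sort the custom vocabulary so the prompt is stable.
    terms = sorted({term.strip() for term in vocabulary if term.strip()})
    if terms:
        rules.append("Spell these terms exactly as written: " + ", ".join(terms) + ".")
    numbered = "\n".join(f"{i}. {rule}" for i, rule in enumerate(rules, 1))
    return f"Rewrite the dictated text below.\n{numbered}\n\nText:\n{transcript}"
```

Calling `build_refinement_prompt("um so sense voice runs on metal", ["SenseVoice", "Metal"])` yields a prompt whose fourth rule lists the custom terms; the resulting string would then be sent to whichever LLM the user has configured.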

Maintenance & Community

The provided README does not contain specific details regarding notable contributors, community channels (like Discord or Slack), or project roadmaps.

Licensing & Compatibility

The project is licensed under CC BY-NC 4.0, which permits sharing and adaptation with attribution but prohibits commercial use. It therefore cannot be used in commercial products.

Limitations & Caveats

AI text optimization is contingent upon the user configuring a valid LLM API key; otherwise, this feature is skipped. The CC BY-NC 4.0 license imposes a strict non-commercial usage restriction. Optimal performance and speed are dependent on Apple Silicon hardware for Metal GPU acceleration.

Health Check

Last commit: 2 weeks ago
Responsiveness: Inactive
Pull requests (30d): 5
Issues (30d): 6
Star history: 67 stars in the last 30 days
