speaktype by karansinghgit

Offline voice dictation for macOS

Created 4 months ago
307 stars

Top 87.1% on SourcePulse

Project Summary

SpeakType is a fully offline, open-source voice dictation application for macOS, aimed at users who prioritize privacy and speed. It lets users dictate text into any application on their Mac. All processing happens locally, so sensitive data never leaves the device.

How It Works

The application uses OpenAI's Whisper model, integrated through the WhisperKit framework, to run speech-to-text inference entirely on the user's machine. Because no audio is sent to a server, dictation itself needs no internet connection and the data stays private. The app is optimized for Apple Silicon, which keeps transcription fast.

Quick Start & Requirements

To install, download SpeakType.dmg from the latest release, drag the app into the Applications folder, and grant the required permissions (Microphone, Accessibility, Documents Folder). Then download an AI model from the app's settings. SpeakType requires macOS 13.0 (Ventura) or newer and benefits significantly from Apple Silicon (M1 or later). The AI models need approximately 2 GB of storage.
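
The requirements above can be checked before downloading anything. The following is a minimal sketch in Python, not part of SpeakType itself: the function names (`parse_version`, `check_requirements`) and the exact thresholds are illustrative assumptions derived from the figures in this section (macOS 13.0+, Apple Silicon preferred, roughly 2 GB free). Note that a Python interpreter running under Rosetta may report `x86_64` even on Apple Silicon hardware.

```python
import platform
import shutil

MIN_MACOS = (13, 0)           # macOS 13.0 (Ventura), per the requirements above
MODEL_BYTES = 2 * 1000**3     # roughly 2 GB for AI models (approximate figure)


def parse_version(text):
    """Turn '13.4.1' into (13, 4, 1); an empty string becomes ()."""
    return tuple(int(part) for part in text.split(".") if part.isdigit())


def check_requirements(macos_version, machine, free_bytes):
    """Return a list of human-readable notes; an empty list means all clear."""
    notes = []
    version = parse_version(macos_version)
    if not version:
        # platform.mac_ver() yields an empty version string off macOS.
        notes.append("not running macOS")
    elif (version + (0, 0))[:2] < MIN_MACOS:
        # Pad so that a bare '13' compares equal to (13, 0).
        notes.append(f"macOS {macos_version} is older than 13.0")
    if machine != "arm64":
        # Not a hard failure: the summary only says Apple Silicon helps.
        notes.append("not Apple Silicon; transcription may be slower")
    if free_bytes < MODEL_BYTES:
        notes.append("less than ~2 GB free for models")
    return notes


if __name__ == "__main__":
    found = check_requirements(
        platform.mac_ver()[0],            # e.g. '14.2.1' on macOS
        platform.machine(),               # 'arm64' on Apple Silicon
        shutil.disk_usage("/").free,      # free bytes on the root volume
    )
    print("OK" if not found else "\n".join(found))
```

Keeping the check as a pure function of three values makes it easy to try against hypothetical machines, for example `check_requirements("12.6", "arm64", 3 * 1000**3)`, without running it on the target Mac.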

Highlighted Details

  • Offline Operation: Fully functional without an internet connection.
  • Privacy-Focused: All audio processing and transcription occur locally.
  • Universal Compatibility: Works seamlessly with any macOS application and text field.
  • Performance: Optimized for Apple Silicon, offering fast dictation speeds.
  • Open Source: Codebase is available for audit and contribution under the MIT license.

Maintenance & Community

The project encourages contributions via standard Git workflows (fork, clone, PR). Specific community channels like Discord or Slack are not detailed in the README. Development relies on Swift 5.9+, SwiftUI, AppKit, and WhisperKit.

Licensing & Compatibility

SpeakType is released under the MIT License, which permits broad use, modification, and distribution, including for commercial purposes and integration into closed-source applications.

Limitations & Caveats

A notable caveat is the initial model loading time, which can range from 30 to 60 seconds when first launching the app or switching between AI models. Apple Silicon is recommended for optimal performance. Intel-based Macs are not explicitly listed as a limitation, but transcription may be slower on them.

Health Check

  • Last Commit: 3 weeks ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 4
  • Issues (30d): 4
  • Star History: 59 stars in the last 30 days
