freestyle by freestyle-voice

Fast, private voice-to-text dictation app

Created 2 weeks ago

333 stars

Top 82.4% on SourcePulse

Project Summary

Freestyle is a free, open-source, local-first voice-to-text dictation app for users who want to enter text faster than they can type. It is built around privacy: the client runs on the user's device, although transcription itself may go through the selected AI model provider. Builds are available for macOS, Windows, and Linux.

How It Works

Freestyle operates via a single hotkey: hold the key, speak, and release, and the transcribed text appears at your cursor. Users choose the AI model provider (OpenAI, Groq, Anthropic, Google, Deepgram, or ElevenLabs) and can bring their own API keys. Post-processing then cleans up grammar and punctuation, removes filler words, applies a custom dictionary of phrase replacements, and reformats the text for the target application, such as an email client.
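The hold, speak, release flow can be pictured as a small state machine. The following is a minimal sketch under stated assumptions, not Freestyle's actual implementation: the class `DictationSession`, its callback names, and `fake_transcriber` are all hypothetical, and the provider call is replaced by a stand-in function.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Hypothetical type: any function that turns recorded audio bytes into text.
Transcriber = Callable[[bytes], str]


@dataclass
class DictationSession:
    """Illustrative hold-to-talk state machine (assumed design)."""
    transcribe: Transcriber
    insert_text: Callable[[str], None]
    _recording: bool = False
    _chunks: List[bytes] = field(default_factory=list)

    def on_hotkey_down(self) -> None:
        # Pressing the key starts a fresh recording.
        self._recording = True
        self._chunks = []

    def on_audio_chunk(self, chunk: bytes) -> None:
        # Audio that arrives while the key is not held is ignored.
        if self._recording:
            self._chunks.append(chunk)

    def on_hotkey_up(self) -> None:
        # Releasing the key ends the recording and inserts the transcript.
        if not self._recording:
            return
        self._recording = False
        audio = b"".join(self._chunks)
        if audio:
            self.insert_text(self.transcribe(audio))


def fake_transcriber(audio: bytes) -> str:
    # Stand-in for a real provider call; it only reports the byte count.
    return f"<{len(audio)} bytes of speech>"


typed: List[str] = []
session = DictationSession(transcribe=fake_transcriber, insert_text=typed.append)
session.on_audio_chunk(b"ignored")  # key not held yet, so this is dropped
session.on_hotkey_down()
session.on_audio_chunk(b"hello ")
session.on_audio_chunk(b"world")
session.on_hotkey_up()
print(typed)
```

Passing the transcriber in as a plain function keeps the key-handling logic independent of whichever provider is selected.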

Quick Start & Requirements

  • Installation: Available as .dmg for macOS (Apple Silicon and Intel), .exe for Windows, and .AppImage / .deb for Linux.
  • Prerequisites: Select and configure a supported AI model provider (OpenAI, Groq, Anthropic, Google, Deepgram, or ElevenLabs); this may require an API key.
  • Community: A Discord server is available for contributor communication.
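As an illustration of the provider prerequisite, the sketch below checks that a key exists for the chosen provider before dictation starts. The provider names come from the list above; the environment variable names and the function `resolve_api_key` are assumptions made for this example, and Freestyle may store keys differently (for instance in its own settings).

```python
import os
from typing import Mapping, Optional

# Assumed variable names, for illustration only.
ASSUMED_KEY_VARS = {
    "openai": "OPENAI_API_KEY",
    "groq": "GROQ_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
    "google": "GOOGLE_API_KEY",
    "deepgram": "DEEPGRAM_API_KEY",
    "elevenlabs": "ELEVENLABS_API_KEY",
}


def resolve_api_key(provider: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Return the key for a provider, or raise with a clear message."""
    env = os.environ if env is None else env
    name = provider.strip().lower()
    if name not in ASSUMED_KEY_VARS:
        raise ValueError(f"unsupported provider: {provider!r}")
    key = env.get(ASSUMED_KEY_VARS[name], "").strip()
    if not key:
        raise RuntimeError(f"set {ASSUMED_KEY_VARS[name]} before using {name}")
    return key


# Usage with an explicit mapping so the example does not depend on the shell.
demo_env = {"GROQ_API_KEY": "example-key"}
print(resolve_api_key("Groq", demo_env))
```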

Highlighted Details

  • Supports multiple leading AI model providers for voice-to-text conversion.
  • Cleans up grammar and punctuation automatically and removes filler words such as "um" and "uh".
  • Includes a custom dictionary for phrase replacements (e.g., "type script" → TypeScript).
  • Offers contextual reformatting based on the target application, such as automatically formatting emails.
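Two of the post-processing steps listed above, filler-word removal and dictionary replacement, can be sketched with regular expressions. This is a rough illustration under stated assumptions: the filler list and the function names are hypothetical, and Freestyle's own cleanup may instead be done by a language model.

```python
import re
from typing import Dict

# Assumed filler list; the words Freestyle actually strips may differ.
FILLERS = ("um", "uh")


def remove_fillers(text: str) -> str:
    # Drop standalone filler words (whole words only, case-insensitive).
    pattern = r"\b(?:" + "|".join(map(re.escape, FILLERS)) + r")\b\s*"
    cleaned = re.sub(pattern, "", text, flags=re.IGNORECASE)
    # Collapse any doubled spaces left behind.
    return re.sub(r"\s{2,}", " ", cleaned).strip()


def apply_dictionary(text: str, dictionary: Dict[str, str]) -> str:
    # Replace longer phrases first so overlapping entries behave predictably.
    for phrase in sorted(dictionary, key=len, reverse=True):
        replacement = dictionary[phrase]
        text = re.sub(
            r"\b" + re.escape(phrase) + r"\b",
            lambda _match, r=replacement: r,  # insert the replacement literally
            text,
            flags=re.IGNORECASE,
        )
    return text


def post_process(text: str, dictionary: Dict[str, str]) -> str:
    return apply_dictionary(remove_fillers(text), dictionary)


raw = "um I think type script is uh easier to refactor"
result = post_process(raw, {"type script": "TypeScript"})
print(result)
```

The word-boundary anchors keep the filler pattern from touching words that merely contain "um" or "uh", such as "umbrella".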

Maintenance & Community

Contributions are welcome; CONTRIBUTING.md covers setup and local development. Contributors communicate primarily through a dedicated Discord server.

Licensing & Compatibility

Freestyle is licensed under the MIT License, which permits commercial use and integration into closed-source projects.

Limitations & Caveats

Transcription quality and cost depend on the chosen external AI model provider and the user's API key. Although the client runs locally, the core AI inference may rely on cloud services, depending on the selected provider.

Health Check

  • Last commit: 23 hours ago
  • Responsiveness: Inactive
  • Pull requests (30d): 146
  • Issues (30d): 85
  • Star history: 334 stars in the last 18 days
