freestyle by freestyle-voice

Fast, private voice-to-text dictation app

Created 2 months ago

467 stars

Top 64.3% on SourcePulse

Project Summary

Freestyle is a local-first, open-source voice-to-text dictation app for users who want to enter text faster than they can type. It is free to use, runs on macOS, Windows, and Linux, and emphasizes privacy by keeping the client on the device, although transcription may run in the cloud depending on the selected AI model provider.

How It Works

Freestyle is driven by a single hotkey: hold the key, speak, and release, and the transcribed text appears at your cursor. Transcription is handled by a configurable AI model provider (OpenAI, Groq, Anthropic, Google, Deepgram, or ElevenLabs), so users can use their preferred service or bring their own API keys. Post-processing then cleans up grammar and punctuation, removes filler words, applies a custom dictionary of phrase replacements, and reformats the text to suit the target application, such as email.
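The hold, speak, release flow described above can be pictured as a small state machine. The following is a minimal sketch only, not Freestyle's actual code: `DictationSession`, its method names, and the `transcribe` and `insert_text` callables are all hypothetical stand-ins for the hotkey listener, the provider call, and the step that types text at the cursor.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class DictationSession:
    """Hypothetical hold-to-talk session; names are illustrative only."""

    transcribe: Callable[[bytes], str]   # stand-in for a provider call
    insert_text: Callable[[str], None]   # stand-in for typing at the cursor
    recording: bool = False
    _chunks: List[bytes] = field(default_factory=list)

    def on_key_down(self) -> None:
        # Holding the hotkey starts a fresh recording.
        self.recording = True
        self._chunks.clear()

    def on_audio(self, chunk: bytes) -> None:
        # Audio is buffered only while the key is held.
        if self.recording:
            self._chunks.append(chunk)

    def on_key_up(self) -> None:
        # Releasing the hotkey stops recording, transcribes, and inserts.
        if not self.recording:
            return
        self.recording = False
        audio = b"".join(self._chunks)
        if audio:
            self.insert_text(self.transcribe(audio))


# Usage with fake audio and a fake transcriber.
typed: List[str] = []
session = DictationSession(
    transcribe=lambda audio: f"<{len(audio)} bytes transcribed>",
    insert_text=typed.append,
)
session.on_audio(b"ignored")  # key not held yet, so this chunk is dropped
session.on_key_down()
session.on_audio(b"hello ")
session.on_audio(b"world")
session.on_key_up()
print(typed)
```

Passing the transcriber and the text sink in as callables keeps the sketch independent of any particular provider or operating system API, which matches the provider-agnostic design the summary describes.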

Quick Start & Requirements

Installation: Available as .dmg for macOS (Apple Silicon and Intel), .exe for Windows, and .AppImage / .deb for Linux.
Prerequisites: Select and configure a supported AI model provider (OpenAI, Groq, Anthropic, Google, Deepgram, ElevenLabs); this may require an API key.
Community: A Discord server is available for contributor communication.

Highlighted Details

Supports multiple leading AI model providers for voice-to-text conversion.
Features automatic grammar and punctuation cleanup, along with removal of filler words like "um" and "oh".
Includes a custom dictionary for phrase replacements (e.g., "type script" → TypeScript).
Offers contextual reformatting based on the target application, such as automatically formatting emails.
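The filler-word removal and custom dictionary listed above can be illustrated with a few regular expressions. This is a rough sketch under stated assumptions, not the project's implementation: the function names are hypothetical, the filler list simply reuses the two examples given above, and matching rules such as case-insensitivity and whole-word boundaries are guesses.

```python
import re
from typing import Mapping, Sequence


def remove_fillers(text: str, fillers: Sequence[str] = ("um", "oh")) -> str:
    # Drop whole-word fillers plus an optional trailing comma and whitespace.
    pattern = r"\b(?:" + "|".join(map(re.escape, fillers)) + r")\b,?\s*"
    return re.sub(pattern, "", text, flags=re.IGNORECASE)


def apply_dictionary(text: str, dictionary: Mapping[str, str]) -> str:
    # Replace each spoken phrase with its preferred spelling, ignoring case.
    for spoken, written in dictionary.items():
        pattern = r"\b" + re.escape(spoken) + r"\b"
        # A function replacement avoids treating `written` as a regex template.
        text = re.sub(pattern, lambda _m, w=written: w, text, flags=re.IGNORECASE)
    return text


def clean_transcript(text: str, dictionary: Mapping[str, str]) -> str:
    text = remove_fillers(text)
    text = apply_dictionary(text, dictionary)
    text = re.sub(r"\s+", " ", text).strip()
    # Capitalise the first character without lowercasing the rest.
    return text[:1].upper() + text[1:]


raw = "um, i think type script is oh a good fit"
cleaned = clean_transcript(raw, {"type script": "TypeScript"})
print(cleaned)  # I think TypeScript is a good fit
```

A real cleanup step would likely rely on a language model rather than fixed patterns for grammar and punctuation; the regex version only shows the shape of the dictionary and filler passes.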

Maintenance & Community

Contributions are welcomed, with setup and local development guidance provided in CONTRIBUTING.md. Project contributors communicate primarily via a dedicated Discord server.

Licensing & Compatibility

Licensed under the MIT license, which permits commercial use and integration into closed-source projects.

Limitations & Caveats

Transcription quality and cost depend on the chosen external AI model provider and the user's API key. Although the client runs locally, the core AI inference may run on cloud services, depending on the selected provider.

