kstonekuan
Universal voice interface for seamless app dictation
Top 92.8% on SourcePulse
Tambourine offers a customizable, open-source voice interface for any application, acting as a privacy-focused alternative to proprietary dictation tools. It enables users to dictate text naturally at their cursor, significantly faster than typing, with AI-powered formatting.
How It Works
A Tauri desktop app (Rust/React) captures audio via hotkeys and communicates with a Python FastAPI backend. The backend streams audio using WebRTC to various Speech-to-Text (STT) and Large Language Model (LLM) providers (cloud or local like Whisper/Ollama) for transcription and intelligent text cleaning (punctuation, filler removal, custom terms). Processed text is returned to the app for direct cursor input. This modular design prioritizes user control over AI services and formatting rules.
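To make the "intelligent text cleaning" step more concrete, here is a minimal sketch of what such a stage does to a raw transcript: filler removal, custom-term substitution, and basic punctuation. It is a rule-based stand-in for illustration only. The description above says Tambourine delegates this work to an LLM provider, and the function name, filler list, and term map below are hypothetical, not taken from the project.

```python
import re

# Hypothetical defaults for illustration; none of these come from Tambourine.
FILLERS = ("um", "uh", "you know")
CUSTOM_TERMS = {"fast api": "FastAPI", "o llama": "Ollama"}


def clean_transcript(raw: str, fillers=FILLERS, terms=CUSTOM_TERMS) -> str:
    """Rule-based stand-in for the LLM cleaning step described above."""
    text = raw.strip()

    # 1. Remove filler words and phrases (whole words only, any case),
    #    together with a comma that directly follows them.
    for filler in fillers:
        text = re.sub(rf"\b{re.escape(filler)}\b,?", "", text, flags=re.IGNORECASE)

    # 2. Replace spoken forms with the user's preferred spelling.
    for spoken, written in terms.items():
        text = re.sub(rf"\b{re.escape(spoken)}\b", written, text, flags=re.IGNORECASE)

    # 3. Collapse leftover whitespace and tidy spaces before punctuation.
    text = re.sub(r"\s+", " ", text).strip()
    text = re.sub(r"\s+([,.!?])", r"\1", text)
    if not text:
        return ""

    # 4. Capitalize the first character and ensure terminal punctuation.
    text = text[0].upper() + text[1:]
    if text[-1] not in ".!?":
        text += "."
    return text


if __name__ == "__main__":
    print(clean_transcript("um the server uses fast api uh for transcription"))
```

A fixed rule set like this cannot handle context (for example, deciding whether "you know" is a filler or part of the sentence), which is one reason a design like the one described would route the step through an LLM with user-defined formatting rules.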
Quick Start & Requirements
Run the desktop app (cd app && pnpm install && pnpm dev) and run the Python server (cd server && uv sync && uv run python main.py). Docker deployment is available for the server.
[...] libwebkit2gtk-4.1-dev, build-essential). Microphone access and macOS Accessibility permissions are mandatory. API keys for the chosen STT and LLM providers (e.g., Cartesia, Deepgram, OpenAI, Groq, Gemini) are required. Local STT/LLM requires Ollama and Whisper setup.
See CONTRIBUTING.md for development setup.
Highlighted Details
[...] Ctrl+Alt+) and toggle recording (Ctrl+Alt+Space) modes.
Maintenance & Community
[...] CONTRIBUTING.md.
Licensing & Compatibility
Limitations & Caveats
[...] ⚠️). Mobile platforms (Android/iOS) are unsupported.
1 day ago
Inactive