phuc-nt/my-translator: Real-time speech translation app for desktop
Top 42.2% on SourcePulse
This desktop application provides real-time speech translation for macOS and Windows users who need live transcription and translation without relying on intermediary servers. Users supply their own API keys, which keeps the setup privacy-conscious, and the app integrates several free and paid Text-to-Speech (TTS) options for flexible, low-latency narration.
How It Works
The application is built with Tauri, leveraging Rust for the backend and a WebView for the frontend. It captures audio directly from system sources (macOS ScreenCaptureKit, Windows WASAPI) or microphones (cpal). This audio is sent to the Soniox API for Speech-to-Text (STT) and translation, achieving approximately 2-3 seconds of latency. Translations are then displayed in a minimal overlay UI, with optional TTS narration provided by integrated services.
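The capture, translate, and display stages described above can be pictured as a small threaded pipeline. The following is only a minimal sketch under stated assumptions, not the project's actual code: `Caption`, `Translator`, `MockTranslator`, and `run_pipeline` are hypothetical names, the mock stands in for the remote Soniox call (whose real API is not shown here), and audio is modeled as plain `i16` PCM chunks rather than real ScreenCaptureKit, WASAPI, or cpal streams.

```rust
use std::sync::mpsc;
use std::thread;

/// One translated segment, as the overlay might display it (hypothetical type).
#[derive(Debug, Clone, PartialEq)]
struct Caption {
    original: String,
    translated: String,
}

/// Hypothetical stand-in for the remote STT + translation service.
trait Translator {
    fn translate_chunk(&mut self, pcm: &[i16]) -> Option<Caption>;
}

/// Mock that emits one caption per non-silent chunk.
/// A real client would stream the audio to the API instead.
struct MockTranslator {
    seen: usize,
}

impl Translator for MockTranslator {
    fn translate_chunk(&mut self, pcm: &[i16]) -> Option<Caption> {
        // Treat an all-zero chunk as silence and produce nothing for it.
        if pcm.iter().all(|&s| s == 0) {
            return None;
        }
        self.seen += 1;
        Some(Caption {
            original: format!("segment {}", self.seen),
            translated: format!("translated segment {}", self.seen),
        })
    }
}

/// Runs capture -> translate -> display and returns the captions
/// that would reach the overlay, in order.
fn run_pipeline<T>(chunks: Vec<Vec<i16>>, mut translator: T) -> Vec<Caption>
where
    T: Translator + Send + 'static,
{
    let (audio_tx, audio_rx) = mpsc::channel::<Vec<i16>>();
    let (caption_tx, caption_rx) = mpsc::channel::<Caption>();

    // Capture stage: pushes audio chunks into the channel, then closes it
    // by dropping the sender when the closure ends.
    let capture = thread::spawn(move || {
        for chunk in chunks {
            if audio_tx.send(chunk).is_err() {
                break;
            }
        }
    });

    // Translation stage: consumes chunks until the audio channel closes.
    let worker = thread::spawn(move || {
        for chunk in audio_rx {
            if let Some(caption) = translator.translate_chunk(&chunk) {
                if caption_tx.send(caption).is_err() {
                    break;
                }
            }
        }
    });

    capture.join().expect("capture thread panicked");
    worker.join().expect("translation thread panicked");

    // Display stage: both senders are gone, so this drains and stops.
    caption_rx.iter().collect()
}
```

Calling `run_pipeline(chunks, MockTranslator { seen: 0 })` with a mix of silent and non-silent chunks returns one caption per non-silent chunk. Separating the stages with channels keeps slow network calls from blocking audio capture, which is one plausible way to hold latency within a few seconds.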
Quick Start & Requirements
git clone https://github.com/phuc-nt/my-translator.git
cd my-translator
npm install
npm run tauri build
Maintenance & Community
The README does not name maintainers, community channels (such as Discord or Slack), or a roadmap.
Licensing & Compatibility
The project is released under the MIT License, which permits commercial use, modification, and redistribution as long as the copyright and license notice are retained.
Limitations & Caveats
The core functionality relies on external APIs (Soniox, Google Cloud TTS, ElevenLabs), which incur costs once usage exceeds their free tiers. The experimental local mode is restricted to Apple Silicon hardware.