my-translator by phuc-nt

Real-time speech translation app for desktop

Created 3 weeks ago

844 stars

Top 42.2% on SourcePulse

Project Summary

This desktop application provides real-time speech translation for macOS and Windows users who need live transcription and translation without intermediary servers. It takes a privacy-conscious approach by using user-provided API keys, and it integrates several free and paid Text-to-Speech (TTS) options, giving a flexible, low-latency solution.

How It Works

The application is built with Tauri, leveraging Rust for the backend and a WebView for the frontend. It captures audio directly from system sources (macOS ScreenCaptureKit, Windows WASAPI) or microphones (cpal). This audio is sent to the Soniox API for Speech-to-Text (STT) and translation, achieving approximately 2-3 seconds of latency. Translations are then displayed in a minimal overlay UI, with optional TTS narration provided by integrated services.
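
The flow described above (capture, STT plus translation, overlay display, optional TTS) can be sketched roughly as follows. This is a minimal illustration only, not the project's code: the real backend is written in Rust, and every name here (`Segment`, `run_pipeline`, `fake_stt`) is hypothetical. The capture and API layers are replaced by stand-ins so the sketch runs on its own.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass
class Segment:
    """One translated utterance returned by the STT/translation step."""
    source_text: str
    translated_text: str


def run_pipeline(
    audio_chunks: Iterable[bytes],
    transcribe_and_translate: Callable[[bytes], Optional[Segment]],
    show_in_overlay: Callable[[Segment], None],
    speak: Optional[Callable[[str], None]] = None,
) -> int:
    """Push captured audio through STT/translation, overlay, and optional TTS.

    Returns the number of segments that were displayed.
    """
    shown = 0
    for chunk in audio_chunks:
        segment = transcribe_and_translate(chunk)
        if segment is None:  # nothing to display for this chunk
            continue
        show_in_overlay(segment)
        if speak is not None:  # TTS narration is optional
            speak(segment.translated_text)
        shown += 1
    return shown


# Stand-in for the remote STT/translation call (assumption, for illustration):
# it treats the chunk as UTF-8 text and "translates" by upper-casing it.
def fake_stt(chunk: bytes) -> Optional[Segment]:
    if not chunk:
        return None
    text = chunk.decode("utf-8")
    return Segment(source_text=text, translated_text=text.upper())


overlay_lines: list[str] = []
spoken: list[str] = []

count = run_pipeline(
    audio_chunks=[b"hello", b"", b"world"],
    transcribe_and_translate=fake_stt,
    show_in_overlay=lambda s: overlay_lines.append(
        f"{s.source_text} -> {s.translated_text}"
    ),
    speak=spoken.append,
)
```

Passing `speak=None` skips narration entirely, which mirrors TTS being optional in the description above.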

Highlighted Details

  • Supports over 70 source languages with real-time translation and ~2-3s latency.
  • Offers three TTS providers: free Edge TTS, free-tier Google Chirp 3 HD, and premium ElevenLabs, with speed control available for Edge and Google.
  • Features a dual-panel view for source/translation side-by-side, smart auto-scrolling, and quick font size adjustments up to 140px.
  • Enables custom translation terms for domain-specific vocabulary (e.g., medical, religious).
  • Includes an experimental local mode for Apple Silicon Macs using MLX, Whisper, and Gemma for on-device translation (JA/EN/ZH/KO → VI/EN).
  • Prioritizes privacy with no intermediary servers, local API key storage, and zero telemetry.
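
As a rough illustration of the TTS options listed above, the three providers and their speed-control support could be modeled as a small lookup table. This is a sketch under stated assumptions: the keys, the `TtsProvider` type, and the `effective_speed` helper are hypothetical, and the fallback to 1.0 for ElevenLabs simply reflects that the summary mentions speed control only for Edge and Google.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TtsProvider:
    name: str
    tier: str
    supports_speed: bool  # per the summary: Edge and Google only


# Provider table taken from the feature list; dictionary keys are made up.
PROVIDERS = {
    "edge": TtsProvider("Edge TTS", "free", True),
    "google": TtsProvider("Google Chirp 3 HD", "free tier", True),
    "elevenlabs": TtsProvider("ElevenLabs", "premium", False),
}


def effective_speed(provider_key: str, requested_speed: float) -> float:
    """Return the speed to use, ignoring the request if unsupported.

    Falling back to 1.0 is an assumption made for this sketch.
    """
    provider = PROVIDERS[provider_key]
    return requested_speed if provider.supports_speed else 1.0


speed_capable = [key for key, p in PROVIDERS.items() if p.supports_speed]
```

Keeping the capability flag next to the provider entry means a settings UI can hide the speed slider when it does not apply, rather than special-casing provider names.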

Maintenance & Community

No specific details regarding maintainers, community channels (like Discord/Slack), or roadmap are present in the provided README.

Licensing & Compatibility

The project is released under the MIT License, which permits commercial use, modification, and redistribution provided the copyright and license notice are retained.

Limitations & Caveats

The core functionality relies on external APIs (Soniox, Google Cloud TTS, ElevenLabs), which can incur costs beyond their free tiers. The experimental local mode is restricted to Apple Silicon Macs.

Health Check
Last Commit

2 days ago

Responsiveness

Inactive

Pull Requests (30d)
3
Issues (30d)
13
Star History
854 stars in the last 21 days
