jasoncheng7115/jt-live-whisper: On-device AI voice toolkit for real-time transcription, translation, and meeting summarization
Top 83.0% on SourcePulse
This toolkit addresses the need for fully on-device AI voice processing, offering real-time transcription, translation, speaker diarization, and meeting summarization without relying on cloud services. It targets users who prioritize data privacy, security, and cost savings, enabling AI-powered analysis of sensitive meetings or of any application's audio output.
How It Works
The project employs a modular architecture, integrating various open-source AI models for speech recognition (Whisper variants, Moonshine), translation (NLLB, local LLMs via Ollama), and summarization (local LLMs). Audio is captured at the system level using virtual audio drivers (BlackHole on macOS, WASAPI Loopback on Windows), allowing processing of any application's audio stream. All AI inference occurs locally or on a private GPU server, ensuring data never leaves the user's control.
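The modular engine design described above can be sketched as pluggable components behind small interfaces, so a Whisper-variant transcriber or an NLLB/LLM translator can be swapped in without changing the pipeline. This is a minimal illustration only; the class and method names are assumptions, not the project's actual API, and the stub engines stand in for the real local models:

```python
from dataclasses import dataclass
from typing import Protocol

class Transcriber(Protocol):
    """Interface a speech-recognition engine (e.g. a Whisper variant) would satisfy."""
    def transcribe(self, audio: bytes) -> str: ...

class Translator(Protocol):
    """Interface a translation engine (e.g. NLLB or a local LLM) would satisfy."""
    def translate(self, text: str) -> str: ...

@dataclass
class VoicePipeline:
    """Chains locally-running engines: captured audio -> transcript -> translation."""
    transcriber: Transcriber
    translator: Translator

    def process(self, audio: bytes) -> dict:
        transcript = self.transcriber.transcribe(audio)
        return {
            "transcript": transcript,
            "translation": self.translator.translate(transcript),
        }

# Stub engines so the sketch runs without any model downloads.
class EchoTranscriber:
    def transcribe(self, audio: bytes) -> str:
        return audio.decode("utf-8")

class UpperTranslator:
    def translate(self, text: str) -> str:
        return text.upper()

pipeline = VoicePipeline(EchoTranscriber(), UpperTranslator())
result = pipeline.process(b"hello world")
```

Because every engine only has to satisfy a narrow interface, the same pipeline can run a tiny on-CPU model or a GPU-server-backed one, which matches the project's local-or-private-GPU deployment options.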
Quick Start & Requirements
Install scripts (install.sh for macOS, install.ps1 for Windows) automate the download and setup of AI models and dependencies.

macOS install script: https://raw.githubusercontent.com/jasoncheng7115/jt-live-whisper/main/install.sh
Windows install script: https://raw.githubusercontent.com/jasoncheng7115/jt-live-whisper/main/install.ps1

Highlighted Details
Maintenance & Community
The project is maintained by Jason Cheng (Jason Tools). No specific community channels (like Discord/Slack) or major contributor/sponsorship information is detailed in the README.
Licensing & Compatibility
The project is licensed under the Apache License 2.0. This permissive license allows for commercial use and integration into closed-source projects.
Limitations & Caveats
Speaker diarization accuracy may vary with audio quality and speaker voice similarity. Translation quality is dependent on the chosen engine, with local LLMs offering the best results but requiring dedicated setup. Summarization functionality necessitates a local LLM server; offline engines only support transcription and translation. Performance is heavily influenced by local hardware capabilities (CPU/GPU). macOS users must configure specific audio routing via "Audio MIDI Setup."
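Since summarization requires a local LLM server, here is a minimal sketch of how a client could call a locally running Ollama instance. The endpoint is Ollama's documented default; the model name and prompt wording are illustrative assumptions, not the project's actual implementation:

```python
import json
import urllib.request

# Ollama's default local endpoint for non-streaming generation.
OLLAMA_URL = "http://localhost:11434/api/generate"

def build_payload(transcript: str, model: str = "llama3") -> dict:
    """Build the JSON body for Ollama's /api/generate endpoint."""
    return {
        "model": model,  # illustrative; use whichever model is pulled locally
        "prompt": "Summarize this meeting transcript:\n\n" + transcript,
        "stream": False,  # request a single complete response
    }

def summarize(transcript: str, model: str = "llama3") -> str:
    """POST the transcript to the local server and return the generated summary."""
    data = json.dumps(build_payload(transcript, model)).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_URL, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:  # raises if no server is running
        return json.loads(resp.read())["response"]
```

Note that the data-privacy guarantee holds only because the URL points at localhost; pointing the same client at a hosted API would reintroduce the cloud dependency the project is designed to avoid.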