Nikorasu
Live transcription tool using OpenAI's Whisper
Top 78.4% on SourcePulse
This project provides a Python implementation for near real-time speech-to-text transcription using OpenAI's Whisper model and the sounddevice library. It's designed for users who need continuous audio processing and offers an optional voice assistant component for command-based interactions.
How It Works
The core livewhisper.py script captures microphone audio, buffering segments that exceed a volume and frequency threshold. Upon detecting silence, it saves the buffered audio to a temporary file and submits it to the Whisper model for transcription, outputting results sentence-by-sentence. The assistant.py script builds upon this, adding voice command capabilities for tasks like weather, Wikipedia searches, and media control.
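The buffering flow described above can be illustrated with a minimal sketch. This is not the project's code: the threshold values, the `Segmenter` class, and every function name below are hypothetical. A zero-crossing count stands in for whatever frequency check livewhisper.py actually performs, and microphone capture via sounddevice is replaced by synthetic blocks so the example runs with the standard library alone.

```python
import math
import os
import struct
import tempfile
import wave

SAMPLE_RATE = 16000           # Hz; assumed value
BLOCK_SIZE = 480              # samples per block (30 ms); assumed value
VOLUME_THRESHOLD = 0.1        # minimum RMS amplitude; hypothetical
FREQ_RANGE = (50.0, 1000.0)   # accepted frequency band in Hz; hypothetical
END_BLOCKS = 3                # quiet blocks that close a segment; hypothetical


def rms(block):
    """Root-mean-square amplitude of one block of float samples."""
    return math.sqrt(sum(x * x for x in block) / len(block))


def zero_crossing_freq(block):
    """Rough frequency estimate: a pure tone crosses zero twice per cycle."""
    crossings = sum(1 for a, b in zip(block, block[1:]) if (a < 0) != (b < 0))
    return crossings * SAMPLE_RATE / (2 * len(block))


def is_speech(block):
    """A block counts as speech if it passes both the volume and frequency checks."""
    low, high = FREQ_RANGE
    return rms(block) > VOLUME_THRESHOLD and low <= zero_crossing_freq(block) <= high


class Segmenter:
    """Buffers speech blocks and emits the buffer once silence is detected."""

    def __init__(self):
        self.buffer = []
        self.quiet = 0

    def feed(self, block):
        if is_speech(block):
            self.buffer.extend(block)
            self.quiet = 0
            return None
        if self.buffer:
            self.quiet += 1
            if self.quiet >= END_BLOCKS:
                segment, self.buffer, self.quiet = self.buffer, [], 0
                return segment
        return None


def transcribe_segment(segment, transcribe):
    """Write the segment to a temporary 16-bit mono WAV and pass its path on."""
    fd, path = tempfile.mkstemp(suffix=".wav")
    os.close(fd)
    try:
        with wave.open(path, "wb") as wav:
            wav.setnchannels(1)
            wav.setsampwidth(2)
            wav.setframerate(SAMPLE_RATE)
            pcm = [int(max(-1.0, min(1.0, x)) * 32767) for x in segment]
            wav.writeframes(struct.pack("<%dh" % len(pcm), *pcm))
        return transcribe(path)
    finally:
        os.remove(path)


def tone(freq, amp=0.5):
    """Synthetic sine block standing in for microphone input."""
    return [amp * math.sin(2 * math.pi * freq * i / SAMPLE_RATE)
            for i in range(BLOCK_SIZE)]


def fake_transcribe(path):
    """Stand-in for a speech model: returns the WAV's frame count, not text."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes()


silence = [0.0] * BLOCK_SIZE
segmenter = Segmenter()
results = []
for block in [silence, tone(220), tone(220), silence, silence, silence]:
    segment = segmenter.feed(block)
    if segment is not None:
        results.append(transcribe_segment(segment, fake_transcribe))
```

With Whisper installed, the stand-in could presumably be swapped for a callable such as `lambda path: model.transcribe(path)["text"]`, where `model = whisper.load_model("base")`. The model size here is an arbitrary choice for illustration.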
Quick Start & Requirements
pip (requires existing Whisper installation).
numpy, scipy, sounddevice, requests, pyttsx3, wikipedia, bs4.
espeak and python3-espeak.
Highlighted Details
Maintenance & Community
Licensing & Compatibility
Limitations & Caveats
The project is described as a "nearly-live" implementation, so some delay between speech and transcribed output should be expected. The voice assistant answers general requests using Google's instant-answer snippets, which may not always be reliable. Media control is noted to require specific audio configuration.