LiveWhisper by Nikorasu

Live transcription tool using OpenAI's Whisper

Created 3 years ago

357 stars

Top 78.4% on SourcePulse

Project Summary

This project provides a Python implementation for near real-time speech-to-text transcription using OpenAI's Whisper model and the sounddevice library. It's designed for users who need continuous audio processing and offers an optional voice assistant component for command-based interactions.

How It Works

The core livewhisper.py script captures microphone audio, buffering segments that exceed a volume and frequency threshold. Upon detecting silence, it saves the buffered audio to a temporary file and submits it to the Whisper model for transcription, outputting results sentence-by-sentence. The assistant.py script builds upon this, adding voice command capabilities for tasks like weather, Wikipedia searches, and media control.
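The buffering step described above can be illustrated with a small, hardware-free sketch. It is not the author's code: every constant, the `SegmentBuffer` class, and the helper names are hypothetical, and the frequency check uses a simple zero-crossing estimate in place of whatever analysis the script actually performs. Blocks are plain lists of float samples so the example runs with the standard library only.

```python
import math

# All values below are assumptions chosen for illustration.
SAMPLE_RATE = 16000          # samples per second
BLOCK_SIZE = 480             # 30 ms of audio per block
VOLUME_THRESHOLD = 0.01      # minimum RMS level to count as sound
VOICE_BAND = (85.0, 1000.0)  # accepted frequency range in Hz
END_BLOCKS = 10              # consecutive quiet blocks that end a segment


def rms(block):
    """Root-mean-square level of one block of samples."""
    return math.sqrt(sum(s * s for s in block) / len(block))


def zero_crossing_freq(block, sample_rate=SAMPLE_RATE):
    """Rough frequency estimate: a pure tone changes sign twice per cycle."""
    crossings = sum(1 for a, b in zip(block, block[1:]) if (a < 0) != (b < 0))
    return crossings * sample_rate / (2 * len(block))


def is_speech(block):
    """True when the block passes both the volume and the frequency check."""
    low, high = VOICE_BAND
    return rms(block) > VOLUME_THRESHOLD and low <= zero_crossing_freq(block) <= high


class SegmentBuffer:
    """Collects blocks while sound is present and emits them after silence."""

    def __init__(self):
        self.samples = []
        self.silent_blocks = 0

    def feed(self, block):
        """Add one block; return the finished segment, or None if still recording."""
        if is_speech(block):
            self.samples.extend(block)
            self.silent_blocks = 0
            return None
        if not self.samples:
            return None  # nothing recorded yet, ignore leading silence
        self.silent_blocks += 1
        if self.silent_blocks < END_BLOCKS:
            self.samples.extend(block)  # keep short pauses inside the segment
            return None
        segment, self.samples, self.silent_blocks = self.samples, [], 0
        return segment


def tone(freq, amplitude=0.5, n=BLOCK_SIZE):
    """Synthetic test block: a sine wave at the given frequency."""
    return [amplitude * math.sin(2 * math.pi * freq * i / SAMPLE_RATE) for i in range(n)]
```

In a real setup, the blocks would presumably arrive from a sounddevice input stream, and each returned segment would be written to a temporary WAV file and passed to Whisper for transcription; those steps are left out here so the sketch stays self-contained.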

Quick Start & Requirements

Install the dependencies via pip; an existing Whisper installation is required.
Dependencies: numpy, scipy, sounddevice, requests, pyttsx3, wikipedia, bs4.
Voice assistant requires espeak and python3-espeak.
On Linux, media controls may require PulseAudio to be configured for noise/echo cancellation.
Official documentation and demo links are not provided in the README.

Highlighted Details

Near real-time, sentence-by-sentence transcription.
Voice assistant with customizable wake words ("computer", "hey computer", "okay computer").
Supports weather, date/time, jokes, Wikipedia searches, basic math, and media player control.
Media control functionality may require audio noise cancellation setup.
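As an illustration of how wake words like those listed above could gate commands, here is a minimal sketch that checks a transcribed sentence for a leading wake word and returns the remainder as the command. The `extract_command` function and its normalization rules are hypothetical and not taken from assistant.py.

```python
import re

# Wake words as listed in the summary above.
WAKE_WORDS = ("hey computer", "okay computer", "computer")


def extract_command(transcript):
    """Return the text after a leading wake word, or None if none is present."""
    # Lowercase and drop punctuation so "Hey computer," matches "hey computer".
    text = re.sub(r"[^\w\s]", "", transcript.lower()).strip()
    for wake in WAKE_WORDS:
        if text == wake or text.startswith(wake + " "):
            return text[len(wake):].strip()
    return None
```

For example, `extract_command("Hey computer, what time is it?")` yields `"what time is it"`, while a sentence that merely mentions the word mid-phrase, such as `"The computer is slow"`, yields `None`.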

Maintenance & Community

The project is maintained by Nikorasu.
A Ko-fi link is provided for donations.
No links to community channels, roadmaps, or other social platforms are present.

Licensing & Compatibility

The license is not explicitly stated in the README.
Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

The project is described as a "nearly-live" implementation, so some latency between speech and output should be expected. The voice assistant handles general requests through Google's instant-answer snippets, which may not always be reliable. Media control may require noise/echo cancellation to be configured in PulseAudio.
