KoljaB
Speech-to-text library for realtime applications
Top 5.7% on SourcePulse
This library provides a robust, low-latency speech-to-text (STT) solution for real-time applications, featuring voice activity detection (VAD) and wake word activation. It's designed for voice assistants and applications requiring fast, accurate speech-to-text conversion, offering an easy-to-use interface for developers.
How It Works
RealtimeSTT leverages a multi-component architecture for efficient processing. Voice Activity Detection is handled by a combination of WebRTCVAD for initial detection and SileroVAD for enhanced accuracy. Speech-to-text transcription is powered by Faster-Whisper, known for its GPU-accelerated, real-time performance. Wake word detection is supported by either Porcupine or OpenWakeWord, providing flexibility in activation methods.
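The two-stage voice activity detection described above can be illustrated with a minimal sketch. This is not RealtimeSTT's implementation: `energy_gate` and `zero_crossing_check` are hypothetical stand-ins for a cheap first-pass detector (the role WebRTCVAD plays) and a costlier confirmation step (the role SileroVAD plays), and the thresholds are arbitrary assumptions. It shows only the gating pattern, where the second stage runs only on frames the first stage flags.

```python
import math
from typing import Callable, Sequence

Frame = Sequence[float]  # one short chunk of audio samples in [-1.0, 1.0]


def energy_gate(frame: Frame, threshold: float = 0.01) -> bool:
    """Hypothetical cheap first stage: mean-square energy check."""
    if not frame:
        return False
    energy = sum(s * s for s in frame) / len(frame)
    return energy >= threshold


def zero_crossing_check(frame: Frame, max_rate: float = 0.35) -> bool:
    """Hypothetical second stage: reject frames whose sign flips too often."""
    if len(frame) < 2:
        return False
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0))
    return crossings / (len(frame) - 1) <= max_rate


class TwoStageVAD:
    """Runs the second detector only when the first one fires."""

    def __init__(
        self,
        first: Callable[[Frame], bool] = energy_gate,
        second: Callable[[Frame], bool] = zero_crossing_check,
    ) -> None:
        self.first = first
        self.second = second

    def is_speech(self, frame: Frame) -> bool:
        # `and` short-circuits, so the costlier check is skipped on quiet frames.
        return self.first(frame) and self.second(frame)


# Synthetic 10 ms frames at an assumed 16 kHz sample rate.
silence = [0.0] * 160
tone = [0.5 * math.sin(2 * math.pi * 200 * n / 16000) for n in range(160)]
buzz = [0.5 if n % 2 == 0 else -0.5 for n in range(160)]

vad = TwoStageVAD()
results = {name: vad.is_speech(f) for name, f in
           [("silence", silence), ("tone", tone), ("buzz", buzz)]}
```

In a real pipeline the second stage would be a model call, so skipping it on frames the first stage rejects is where the savings come from.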
Quick Start & Requirements
Install: pip install RealtimeSTT
Linux prerequisites: sudo apt-get update && sudo apt-get install python3-dev portaudio19-dev
macOS prerequisites: brew install portaudio
GPU support: install a CUDA-enabled PyTorch build (e.g. pip install torch==2.5.1+cu118 torchaudio==2.5.1 --index-url https://download.pytorch.org/whl/cu118). Full CUDA setup involves installing the NVIDIA CUDA Toolkit and cuDNN.
Highlighted Details
Model sizes from tiny to large-v2, and language auto-detection.
Maintenance & Community
Licensing & Compatibility
Limitations & Caveats
Incompatible ctranslate2 and cuDNN versions can cause loading errors, requiring downgrades or upgrades.
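When chasing a version conflict like this, a reasonable first step is to record which versions are actually installed. The sketch below uses only the standard library's `importlib.metadata`. The helper name `installed_versions` is hypothetical, and the package list is an assumption about which distributions are relevant. cuDNN is usually installed outside pip, so it may not appear here and may need to be checked separately.

```python
from importlib import metadata
from typing import Dict, Iterable, Optional


def installed_versions(packages: Iterable[str]) -> Dict[str, Optional[str]]:
    """Return {distribution name: version string, or None if not installed}."""
    report: Dict[str, Optional[str]] = {}
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = None
    return report


if __name__ == "__main__":
    # Assumed set of packages worth checking; adjust to the environment.
    for name, version in installed_versions(
        ["ctranslate2", "faster-whisper", "torch"]
    ).items():
        print(f"{name}: {version or 'not installed'}")
```

The output can then be compared against the compatibility notes in the ctranslate2 documentation before deciding which package to upgrade or downgrade.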