lhl/voicechat2: Local AI voice chat using WebSockets
Top 46.9% on SourcePulse
This project provides a fast, fully local AI voice chat system built on WebSockets, aimed at users who want to build or experiment with real-time conversational AI agents. It offers modular Speech-to-Text (STT), Large Language Model (LLM), and Text-to-Speech (TTS) components, enabling customizable voice chat experiences with low latency.
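The WebSocket transport carries audio frames between client and server. As a rough illustration of low-latency, frame-based audio streaming, the sketch below uses a plain asyncio TCP socket with length-prefixed frames as a stand-in for the project's actual WebSocket protocol, which may differ:

```python
import asyncio
import struct

def pack_frame(payload: bytes) -> bytes:
    """Length-prefix a chunk of audio bytes (4-byte big-endian header)."""
    return struct.pack(">I", len(payload)) + payload

async def read_frame(reader: asyncio.StreamReader) -> bytes:
    """Read one length-prefixed frame from the stream."""
    (length,) = struct.unpack(">I", await reader.readexactly(4))
    return await reader.readexactly(length)

async def handle_client(reader, writer):
    # Stand-in for the server's STT -> LLM -> TTS round trip:
    # here we just transform the payload and send it back.
    payload = await read_frame(reader)
    writer.write(pack_frame(payload.upper()))
    await writer.drain()
    writer.close()

async def round_trip(chunk: bytes) -> bytes:
    """Send one audio frame to a local server and await the reply."""
    server = await asyncio.start_server(handle_client, "127.0.0.1", 0)
    port = server.sockets[0].getsockname()[1]
    reader, writer = await asyncio.open_connection("127.0.0.1", port)
    writer.write(pack_frame(chunk))
    await writer.drain()
    reply = await read_frame(reader)
    writer.close()
    server.close()
    await server.wait_closed()
    return reply

print(asyncio.run(round_trip(b"hello")))  # b'HELLO'
```

Framing each chunk separately is what lets the server start transcribing while the client is still speaking, which is where the low perceived latency comes from.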
How It Works
The system employs a modular architecture, allowing users to swap out STT, LLM, and TTS backends. It uses WebSockets for communication, enabling simple remote access. Key components include Voice Activity Detection (VAD) via ricky0123/vad and Opus audio support via symblai/opus-encdec. The modular design allows integration with popular libraries such as whisper.cpp, faster-whisper, llama.cpp, Coqui TTS, StyleTTS2, and Piper.
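The swap-out modularity can be pictured as three narrow interfaces chained per utterance. The sketch below is illustrative only: the class and method names are invented, not voicechat2's actual API.

```python
from dataclasses import dataclass
from typing import Protocol

class STT(Protocol):
    def transcribe(self, audio: bytes) -> str: ...

class LLM(Protocol):
    def reply(self, text: str) -> str: ...

class TTS(Protocol):
    def synthesize(self, text: str) -> bytes: ...

@dataclass
class VoiceChatPipeline:
    """Chain one utterance through STT -> LLM -> TTS."""
    stt: STT
    llm: LLM
    tts: TTS

    def handle_utterance(self, audio: bytes) -> bytes:
        text = self.stt.transcribe(audio)    # e.g. a whisper.cpp backend
        answer = self.llm.reply(text)        # e.g. a llama.cpp backend
        return self.tts.synthesize(answer)   # e.g. a Piper backend

# Toy backends standing in for the real libraries:
class EchoSTT:
    def transcribe(self, audio: bytes) -> str:
        return audio.decode()

class ShoutLLM:
    def reply(self, text: str) -> str:
        return text.upper() + "!"

class BytesTTS:
    def synthesize(self, text: str) -> bytes:
        return text.encode()

pipeline = VoiceChatPipeline(EchoSTT(), ShoutLLM(), BytesTTS())
print(pipeline.handle_utterance(b"hello"))  # b'HELLO!'
```

Because each backend only has to satisfy a small structural interface, replacing faster-whisper with whisper.cpp (or Coqui TTS with StyleTTS2) is a one-line change at construction time.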
Quick Start & Requirements
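A condensed sketch of the setup described in this section, assuming the standard GitHub locations for voicechat2 and llama.cpp (repo URLs and paths are illustrative; the build flags are the ROCm/CUDA ones from the README):

```shell
# Fetch the project and install Python dependencies
git clone https://github.com/lhl/voicechat2
cd voicechat2
pip install -r requirements.txt

# Build llama.cpp with GPU offload enabled
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
make GGML_CUDA=1    # NVIDIA; use GGML_HIPBLAS=1 for AMD ROCm
```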
- Install Python dependencies (pip install -r requirements.txt) and build llama.cpp with ROCm (GGML_HIPBLAS=1) or CUDA (GGML_CUDA=1) support.
- System packages: byobu, curl, wget, espeak-ng, ffmpeg, libopus0, libopus-dev.
- Requires a GGUF-format model (e.g., Meta-Llama-3-8B-Instruct-Q4_K_M.gguf).

Highlighted Details
- Ships helper scripts (run-voicechat2.sh, remote-tunnel.sh, local-tunnel.sh) for deployment and remote access.

Maintenance & Community
The project is maintained by lhl. No specific community channels (Discord/Slack) or roadmap are explicitly mentioned in the README.
Licensing & Compatibility
The README does not explicitly state a license for the voicechat2 repository itself. However, it references other projects with MIT and Apache 2.0 licenses. Compatibility for commercial use or closed-source linking is not specified.
Limitations & Caveats
Installation instructions are specific to Ubuntu LTS and assume prior ROCm/CUDA setup. The project does not specify a license, which may impact commercial adoption. Some referenced related projects have unclear or missing licenses.
Updated 1 year ago · Inactive