Talking LLM for local voice assistant creation
This project provides a Python-based framework for building an offline, voice-activated AI assistant. It targets users interested in creating personal AI agents similar to Jarvis or Friday, enabling local, internet-free conversational capabilities.
How It Works
The assistant integrates three core open-source components: OpenAI's Whisper for speech-to-text, Ollama serving a Llama-2 model for natural language understanding and response generation, and Suno AI's Bark for text-to-speech synthesis. The workflow involves recording user speech, transcribing it to text, processing the text through the LLM for a response, and finally vocalizing the response using Bark. This modular approach allows for customization and leverages powerful, locally runnable models.
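To make that flow concrete, here is a minimal sketch of one turn of the loop. It is not the project's actual code: the function names, the fixed five-second recording window, and the model choices are illustrative, and it assumes the openai-whisper, sounddevice, langchain-community, and transformers APIs for Whisper, audio I/O, Ollama, and Bark respectively.

```python
# Minimal one-turn sketch: record -> transcribe -> generate -> speak.
# Assumed dependencies: openai-whisper, sounddevice, langchain-community, transformers, torch.
import numpy as np
import sounddevice as sd
import whisper
from langchain_community.llms import Ollama
from transformers import AutoProcessor, BarkModel

SAMPLE_RATE = 16_000          # Whisper expects 16 kHz mono audio
stt = whisper.load_model("base")
llm = Ollama(model="llama2")  # talks to a locally running Ollama server
processor = AutoProcessor.from_pretrained("suno/bark-small")
tts = BarkModel.from_pretrained("suno/bark-small")

def record(seconds: float = 5.0) -> np.ndarray:
    """Record a fixed-length utterance from the default microphone."""
    audio = sd.rec(int(seconds * SAMPLE_RATE), samplerate=SAMPLE_RATE,
                   channels=1, dtype="float32")
    sd.wait()
    return audio.squeeze()

def transcribe(audio: np.ndarray) -> str:
    """Speech-to-text with Whisper."""
    return stt.transcribe(audio, fp16=False)["text"].strip()

def respond(prompt: str) -> str:
    """Send the transcription to the local Llama-2 model via Ollama."""
    return llm.invoke(prompt)

def speak(text: str) -> None:
    """Text-to-speech with Bark, played back through sounddevice."""
    inputs = processor(text, return_tensors="pt")
    waveform = tts.generate(**inputs).cpu().numpy().squeeze()
    sd.play(waveform, samplerate=tts.generation_config.sample_rate)
    sd.wait()

if __name__ == "__main__":
    heard = transcribe(record())
    print("You said:", heard)
    reply = respond(heard)
    print("Assistant:", reply)
    speak(reply)
```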
Quick Start & Requirements
- Python dependencies: openai-whisper, suno-bark, langchain, sounddevice, pyaudio, speechrecognition, and rich.
- A local Ollama installation with the llama2 model pulled (ollama pull llama2); a quick smoke test follows this list.
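Before wiring up the full loop, it helps to confirm the local setup responds. The snippet below is a hedged example rather than part of the project: it assumes ollama serve is running and llama2 has been pulled, it uses the langchain-community Ollama wrapper, and the "tiny" Whisper model is chosen only because it is a small download.

```python
# Environment smoke test: confirm Whisper loads locally and Ollama serves llama2.
# Assumes the dependencies above are installed and `ollama serve` is running.
import whisper
from langchain_community.llms import Ollama

stt = whisper.load_model("tiny")     # small download; verifies openai-whisper works
llm = Ollama(model="llama2")         # fails fast if the server or model is missing
print("Whisper loaded, multilingual:", stt.is_multilingual)
print("Ollama says:", llm.invoke("Say 'ready' and nothing else."))
```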
Highlighted Details
- Uses suno/bark-small for text-to-speech, with the potential to use larger Bark models.
- langchain is used for managing the conversational chain with Ollama (see the sketch after this list).
- Faster .cpp implementations of the underlying models are suggested as a performance optimization.
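The summary does not say which chain type langchain provides here, so the sketch below is an assumption rather than the project's implementation: it pairs the Ollama llama2 wrapper with ConversationChain and ConversationBufferMemory so that earlier turns are carried into later prompts.

```python
# Hedged sketch of a memory-backed conversational chain over the local llama2 model.
# The chain and memory classes are assumptions; the summary only states that
# langchain manages the conversation with Ollama.
from langchain.chains import ConversationChain
from langchain.memory import ConversationBufferMemory
from langchain_community.llms import Ollama

chain = ConversationChain(
    llm=Ollama(model="llama2"),
    memory=ConversationBufferMemory(),  # keeps prior turns in the prompt
)

print(chain.predict(input="My name is Ada."))
print(chain.predict(input="What is my name?"))  # memory lets the model answer this
```

Keeping memory in the chain is what lets a follow-up question refer back to an earlier exchange without the user repeating themselves.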
Maintenance & Community
The project is based on a blog post and tutorial, with the primary contributor being duy-huynh. Further community engagement or maintenance status is not detailed in the README.
Licensing & Compatibility
The README does not explicitly state a license for the project code itself. However, it relies on libraries with their own licenses (Whisper, Bark, Langchain, Ollama), which may have implications for commercial use.
Limitations & Caveats
The application can run slowly, particularly on systems without a GPU, due to the resource demands of the Bark model. Performance optimization suggestions are provided but not implemented in the base code.