Local voice assistant tool using speech-to-text and an LLM
This project provides an offline voice interface for local LLM interaction, combining Whisper for speech-to-text, Ollama for LLM processing, and pyttsx3 for text-to-speech. It's designed for users who want to interact with local AI models via voice without relying on cloud services.
How It Works
The system orchestrates three core components: Whisper for local, offline speech recognition, Ollama for running local LLM models, and pyttsx3 for offline text-to-speech synthesis. The workflow involves capturing audio via a key press, transcribing it with Whisper, sending the text to a locally running Ollama instance, and then converting the LLM's response back into speech using pyttsx3.
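The loop below is a minimal sketch of that workflow, not the project's actual code: it assumes the `sounddevice` and `soundfile` packages for audio capture, Ollama's default HTTP endpoint, the `mistral` model, and a fixed five-second recording in place of key-press-driven capture.

```python
# Minimal sketch of the capture -> transcribe -> generate -> speak loop.
# Model choice, Ollama endpoint, and recording length are assumptions.
import requests
import sounddevice as sd
import soundfile as sf
import whisper
import pyttsx3

SAMPLE_RATE = 16000                                 # Whisper expects 16 kHz mono
OLLAMA_URL = "http://localhost:11434/api/generate"  # default Ollama endpoint

model = whisper.load_model("base")  # README mentions large-v3; "base" is lighter
tts = pyttsx3.init()

# Record a short prompt from the microphone (fixed duration for simplicity)
input("Press Enter and speak for 5 seconds...")
audio = sd.rec(5 * SAMPLE_RATE, samplerate=SAMPLE_RATE, channels=1)
sd.wait()
sf.write("prompt.wav", audio, SAMPLE_RATE)

# Transcribe the recording locally with Whisper
text = model.transcribe("prompt.wav")["text"]

# Send the transcript to a locally running Ollama instance
resp = requests.post(
    OLLAMA_URL,
    json={"model": "mistral", "prompt": text, "stream": False},
    timeout=120,
)
answer = resp.json()["response"]

# Speak the LLM's reply offline with pyttsx3
tts.say(answer)
tts.runAndWait()
```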
Quick Start & Requirements
1. Install the package: `pip install .`
2. Download a Whisper model checkpoint (e.g., `large-v3.pt`) and place it in the `whisper` subfolder.
3. Adjust the settings in `assistant.yaml` (see the sketch below).
4. Run `assistant.py`.
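The README does not document the schema of `assistant.yaml`, so the key names in this loading sketch are hypothetical; only the French and Mistral defaults come from the README.

```python
# Hypothetical sketch of reading assistant.yaml; key names are assumptions.
import yaml

with open("assistant.yaml") as f:
    config = yaml.safe_load(f)

language = config.get("language", "fr")     # assumed key; French is the stated default
llm_model = config.get("model", "mistral")  # assumed key; Mistral is the stated default
print(f"Using language={language}, model={llm_model}")
```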
Highlighted Details
Configuration is handled through `assistant.yaml` (defaults to French and the Mistral model).
Maintenance & Community
No specific community links or notable contributors are mentioned in the README.
Licensing & Compatibility
The README does not specify a license.
Limitations & Caveats
The project is described as a "simple combination" and lists "Rearrange code base" and "Multi threading" as to-do items, suggesting it is in an early stage of development. GPU setup for Whisper is a prerequisite.
Last activity: about 1 year ago; the repository is marked inactive.