local-talking-llm by vndee

Talking LLM for local voice assistant creation

Created 1 year ago
667 stars

Top 50.5% on SourcePulse

Project Summary

This project provides a Python-based framework for building an offline, voice-activated AI assistant. It targets users who want to create personal AI agents in the style of Jarvis or Friday, enabling conversational capabilities that run entirely on local hardware with no internet connection required.

How It Works

The assistant integrates three core open-source components: OpenAI's Whisper for speech-to-text, Ollama serving a Llama-2 model for natural language understanding and response generation, and Suno AI's Bark for text-to-speech synthesis. The workflow involves recording user speech, transcribing it to text, processing the text through the LLM for a response, and finally vocalizing the response using Bark. This modular approach allows for customization and leverages powerful, locally runnable models.
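The four-stage loop described above can be sketched with each stage as a swappable callable, so that Whisper, Ollama, and Bark can be wired in (or stubbed out for testing). All names here are illustrative, not the project's actual API:

```python
from typing import Callable

def assistant_turn(
    record: Callable[[], bytes],         # e.g., capture audio via sounddevice
    transcribe: Callable[[bytes], str],  # e.g., Whisper speech-to-text
    respond: Callable[[str], str],       # e.g., Llama-2 served by Ollama
    speak: Callable[[str], None],        # e.g., Bark text-to-speech
) -> str:
    """One turn of the voice loop: record -> transcribe -> respond -> speak."""
    audio = record()
    text = transcribe(audio)
    reply = respond(text)
    speak(reply)
    return reply
```

Because each stage is injected, the heavy models can be swapped for lighter alternatives, or replaced with stubs, without touching the control flow.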

Quick Start & Requirements

  • Install: Requires Python environment setup (e.g., Poetry). Key libraries include openai-whisper, suno-bark, langchain, sounddevice, pyaudio, speechrecognition, and rich.
  • LLM Backend: Ollama must be installed and running, with a model like llama2 pulled (ollama pull llama2).
  • Hardware: A CUDA-enabled GPU is recommended for faster processing, as the Bark model can be resource-intensive.
  • Docs: Original article and demo video available.
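Given those requirements, setup might look like the following. This is a hedged sketch: the repository URL and Poetry invocation are inferred from the summary above, so check the project README for the exact commands:

```shell
# Pull the Llama-2 model (requires a running Ollama install)
ollama pull llama2

# Fetch the project and install its Python dependencies with Poetry
git clone https://github.com/vndee/local-talking-llm.git
cd local-talking-llm
poetry install
```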

Highlighted Details

  • Voice-based interaction with conversational context maintained across turns.
  • Uses suno/bark-small for text-to-speech, with the option to swap in larger Bark models.
  • langchain manages the conversational chain with Ollama.
  • Offers suggestions for performance optimization via C++ ports of the models (.cpp implementations).
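The context maintenance noted above can be illustrated without any framework by threading the running transcript into each prompt. The project itself uses langchain's chain abstraction over Ollama, so this is only a minimal, framework-free sketch with hypothetical names:

```python
class Conversation:
    """Keeps a rolling transcript so each LLM call sees the prior turns."""

    def __init__(self, llm):
        self.llm = llm      # callable: prompt str -> reply str
        self.history = []   # list of (speaker, text) pairs

    def ask(self, user_text):
        self.history.append(("User", user_text))
        # Flatten the whole transcript into the prompt for context.
        prompt = "\n".join(f"{who}: {text}" for who, text in self.history)
        reply = self.llm(prompt + "\nAssistant:")
        self.history.append(("Assistant", reply))
        return reply
```

In practice a buffer like this grows with every turn, which is why langchain's memory classes cap or summarize the history for longer sessions.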

Maintenance & Community

The project is based on a blog post and tutorial, with the primary contributor being duy-huynh. Further community engagement or maintenance status is not detailed in the README.

Licensing & Compatibility

The README does not explicitly state a license for the project code itself. However, it relies on libraries with their own licenses (Whisper, Bark, Langchain, Ollama), which may have implications for commercial use.

Limitations & Caveats

The application can run slowly, particularly on systems without a GPU, due to the resource demands of the Bark model. Performance optimization suggestions are provided but not implemented in the base code.

Health Check

  • Last Commit: 5 days ago
  • Responsiveness: 1 week
  • Pull Requests (30d): 0
  • Issues (30d): 1
  • Star History: 26 stars in the last 30 days

