talk by yacineMTB

Local conversational engine demo with audio

created 2 years ago
591 stars

Top 55.8% on sourcepulse

Project Summary

This project aims to provide a local conversational AI engine for interacting with a computer by voice. It targets users who are comfortable with technical tinkering. It connects speech recognition, a large language model, and a text-to-speech system, offering a foundation for custom human-computer interaction (HCI) applications.

How It Works

The engine uses an event-based architecture that ties together Whisper.cpp for speech-to-text, Llama.cpp for language processing, and Piper for text-to-speech (TTS). It captures audio input, transcribes it to text, passes the text to the LLM to generate a response, and then synthesizes that response as speech. Because every stage runs on the local machine, the approach favors privacy and works offline.
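The flow described above can be sketched as a small event bus in which each stage listens for one event and emits the next. This is only an illustration of the idea, not the project's actual code: the event names, the `EventBus` class, and the stub stages are all hypothetical, and the stages run synchronously here, whereas real calls to Whisper.cpp, Llama.cpp, and Piper would be asynchronous.

```typescript
// Hypothetical event names and payloads; the real project may use different ones.
type Audio = number[]; // placeholder for audio samples
type Events = {
  audioCaptured: Audio; // microphone input
  transcribed: string;  // speech-to-text output
  responded: string;    // language model output
  spoken: Audio;        // text-to-speech output
};

// Minimal typed event bus with no external dependencies.
class EventBus {
  private handlers: Record<string, Array<(payload: any) => void>> = {};

  on<K extends keyof Events>(event: K, handler: (payload: Events[K]) => void): void {
    const list = this.handlers[event] || [];
    list.push(handler);
    this.handlers[event] = list;
  }

  emit<K extends keyof Events>(event: K, payload: Events[K]): void {
    for (const handler of this.handlers[event] || []) {
      handler(payload);
    }
  }
}

// The three stages, expressed as plain functions so they can be swapped out.
interface Stages {
  speechToText(audio: Audio): string;
  generateReply(text: string): string;
  textToSpeech(text: string): Audio;
}

// Each stage subscribes to the previous stage's event and emits its own.
function wirePipeline(bus: EventBus, stages: Stages): void {
  bus.on("audioCaptured", (audio) => bus.emit("transcribed", stages.speechToText(audio)));
  bus.on("transcribed", (text) => bus.emit("responded", stages.generateReply(text)));
  bus.on("responded", (reply) => bus.emit("spoken", stages.textToSpeech(reply)));
}

// Stub stages standing in for Whisper.cpp, Llama.cpp, and Piper.
const stubStages: Stages = {
  speechToText: () => "hello",
  generateReply: (text) => "You said: " + text,
  textToSpeech: (text) => text.split("").map((c) => c.charCodeAt(0)),
};

const bus = new EventBus();
const log: string[] = [];

// Observers are registered first so they run before the next stage fires.
bus.on("transcribed", (text) => log.push("heard: " + text));
bus.on("responded", (reply) => log.push("reply: " + reply));
bus.on("spoken", (audio) => log.push("spoke: " + audio.length + " samples"));

wirePipeline(bus, stubStages);
bus.emit("audioCaptured", [0, 0, 0]);
```

One consequence of this design is that a stage can be replaced, or an extra listener such as a logger added, without touching the other stages.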

Quick Start & Requirements

  • Install: chmod 775 build.sh && ./build.sh (experimental) or manual installation via npm install and compiling submodules.
  • Prerequisites: Node.js v14.15+, Piper TTS engine (added to PATH), Graphviz (optional), CUDA (for GPU acceleration), SoX (for mic input).
  • Setup: Requires manually compiling the C++ submodules (Whisper.cpp, Llama.cpp) and downloading LLM weights. Expect setup to take a while, since several steps are manual and may need debugging.
  • Links: Whisper.cpp, Llama.cpp, Piper TTS
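Since the prerequisites name a minimum Node.js version (v14.15+), a setup script could check it before building. The helper below is a hypothetical sketch, not part of the project; it compares dotted version strings numerically and ignores pre-release suffixes.

```typescript
// Split a version such as "v14.15.0" into numeric parts: [14, 15, 0].
function parseVersion(version: string): number[] {
  return version
    .replace(/^v/, "")
    .split(".")
    .map((part) => parseInt(part, 10) || 0);
}

// Return true when `actual` is greater than or equal to `minimum`.
function meetsMinimum(actual: string, minimum: string): boolean {
  const a = parseVersion(actual);
  const m = parseVersion(minimum);
  const length = Math.max(a.length, m.length);
  for (let i = 0; i < length; i++) {
    // Missing parts count as 0, so "14.15" equals "14.15.0".
    const diff = (a[i] || 0) - (m[i] || 0);
    if (diff !== 0) return diff > 0;
  }
  return true;
}
```

In a Node.js script this could be called as `meetsMinimum(process.version, "14.15.0")`, where `process.version` is the running Node.js version string.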

Highlighted Details

  • Runs completely locally for enhanced privacy.
  • Supports GPU acceleration via cuBLAS for Whisper.cpp.
  • Event-based architecture for modularity.
  • Includes optional Graphviz for visualizing the event graph.
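To illustrate how an event graph could be handed to Graphviz, the sketch below turns a list of edges into DOT text. It is not the project's own visualization code: the `Edge` type, the `toDot` function, the "microphone" node, and the edge labels are assumptions made for illustration, while the other node names come from the components listed above.

```typescript
// One directed edge in the event graph; the label is optional.
type Edge = { from: string; to: string; label?: string };

// Quote an identifier for DOT, escaping embedded double quotes.
function quote(id: string): string {
  return '"' + id.replace(/"/g, '\\"') + '"';
}

// Render the edges as a Graphviz "digraph" in DOT syntax.
function toDot(name: string, edges: Edge[]): string {
  const lines = edges.map((edge) => {
    const attrs = edge.label ? " [label=" + quote(edge.label) + "]" : "";
    return "  " + quote(edge.from) + " -> " + quote(edge.to) + attrs + ";";
  });
  return ["digraph " + quote(name) + " {"].concat(lines, ["}"]).join("\n");
}

// Hypothetical graph of the audio-to-speech flow.
const dot = toDot("talk", [
  { from: "microphone", to: "whisper.cpp", label: "audio" },
  { from: "whisper.cpp", to: "llama.cpp", label: "text" },
  { from: "llama.cpp", to: "piper", label: "reply" },
]);
```

If the resulting text is saved to a file such as `graph.dot`, Graphviz can render it with `dot -Tpng graph.dot -o graph.png`.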

Maintenance & Community

The project was last updated in June 2023. The README describes plans to improve setup and usability, which suggests further development was intended. It does not link to any community channels or a roadmap.

Licensing & Compatibility

The README does not state a license. The project integrates Whisper.cpp and Llama.cpp, which carry their own licenses (both are MIT-licensed). Anyone considering commercial use should verify the licenses of all integrated components, including the model weights.

Limitations & Caveats

The README states that the project is at an early stage and that setup is complex, requiring significant "elbow grease." The intended audience is those comfortable with "hacking things together."

Health Check

  • Last commit: 1 year ago
  • Responsiveness: 1 week
  • Pull requests (30d): 0
  • Issues (30d): 0

Star History

2 stars in the last 90 days
