Linguflex by KoljaB

Voice-based AI companion for home automation and information retrieval

Created 2 years ago
756 stars

Top 46.0% on SourcePulse

Project Summary

Linguflex aims to provide a Jarvis-like AI companion experience, enabling voice-based interaction for controlling smart home devices, managing schedules, searching the web, and more. It targets users seeking an advanced, customizable AI assistant and developers interested in contributing to AI evolution. The project emphasizes local operation, low latency, and high-quality audio output through voice cloning.

How It Works

Linguflex operates locally. A "Listen" module processes audio input, a "Brain" module performs the cognitive tasks (using either a local LLM or the OpenAI API), and a "Speech" module generates the audio output. The "Speech" module uses Retrieval-based Voice Conversion (RVC) and fine-tuned XTTS for high-quality, low-latency voice synthesis. Additional modules extend the assistant with music, email, weather, smart home control, and more, and keyword pre-parsing is used to optimize LLM interaction.
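To make the Listen, Brain, Speech flow and the keyword pre-parsing step more concrete, here is a minimal sketch. It is not Linguflex's actual code: the `Module` class, the `MODULES` registry, the keyword sets, and the `listen`, `brain`, and `speak` stand-ins are all hypothetical names chosen for illustration. The sketch assumes that pre-parsing means scanning the transcribed text for trigger keywords so that only the matching modules are offered to the language model.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Module:
    """Hypothetical extension module identified by its trigger keywords."""
    name: str
    keywords: Set[str] = field(default_factory=set)


# Illustrative registry; the real project's modules and keywords may differ.
MODULES = [
    Module("music", {"play", "song", "music"}),
    Module("weather", {"weather", "forecast", "rain"}),
    Module("smart_home", {"light", "lights", "thermostat"}),
]


def select_modules(text: str, modules: List[Module] = MODULES) -> List[str]:
    """Keyword pre-parsing: keep only modules whose keywords occur in the text."""
    words = {w.strip(".,!?").lower() for w in text.split()}
    return [m.name for m in modules if m.keywords & words]


def listen(audio_as_text: str) -> str:
    """Stand-in for speech-to-text; here the 'audio' is already text."""
    return audio_as_text.strip()


def brain(text: str, active_modules: List[str]) -> str:
    """Stand-in for the LLM call; reports which modules would be offered."""
    tools = ", ".join(active_modules) if active_modules else "none"
    return f"request={text!r}; modules offered to the LLM: {tools}"


def speak(reply: str) -> str:
    """Stand-in for TTS; returns the text that would be synthesized."""
    return reply


def handle(audio_as_text: str) -> str:
    """Run one turn through the Listen -> Brain -> Speech pipeline."""
    text = listen(audio_as_text)
    return speak(brain(text, select_modules(text)))
```

Calling `handle("Turn on the lights, please")` would offer only the hypothetical `smart_home` module to the model. Under this assumption, the benefit of pre-parsing is a shorter prompt, since the model never sees descriptions of modules that are irrelevant to the request.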

Quick Start & Requirements

Highlighted Details

  • Ollama support is now integrated.
  • Achieves ultra-low latency for language model communication and TTS generation.
  • Offers near-ElevenLabs quality local TTS synthesis via RVC and XTTS.
  • Features a modular design for extensibility, with planned modules for vision, memory, news, finance, and image generation.

Maintenance & Community

The project is a personal passion project actively seeking community contributions and insights. Philip Ehrbright is credited for developing the Ollama support feature.

Licensing & Compatibility

The codebase is MIT licensed. However, the TTS engines it can use (CoquiEngine, ElevenlabsEngine, AzureEngine) and their models carry their own restrictions: they are free to use only in non-commercial projects, and commercial use requires a paid plan or a specific tier. OpenAI engine usage is subject to OpenAI's terms.

Limitations & Caveats

The installation process is noted as challenging and potentially unstable due to complex integrations and Python's dependency management. Commercial use of the core TTS voice generation capabilities is restricted by the underlying model licenses, requiring separate paid plans for many components.

Health Check
  • Last commit: 3 months ago
  • Responsiveness: 1 day
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 22 stars in the last 30 days
