Linguflex by KoljaB

Voice-based AI companion for home automation and information retrieval

created 2 years ago
737 stars

Top 48.0% on sourcepulse

Project Summary

Linguflex aims to provide a Jarvis-like AI companion experience, enabling voice-based interaction for controlling smart home devices, managing schedules, searching the web, and more. It targets users seeking an advanced, customizable AI assistant and developers interested in contributing to AI evolution. The project emphasizes local operation, low latency, and high-quality audio output through voice cloning.

How It Works

Linguflex runs locally. A "Listen" module processes audio input, a "Brain" module handles cognitive tasks (using either a local LLM or the OpenAI API), and a "Speech" module generates audio output. The "Speech" module uses Retrieval-based Voice Conversion (RVC) and fine-tuned XTTS for high-quality, low-latency voice synthesis. Additional modules extend the assistant with music, email, weather, smart home control, and more, and keyword pre-parsing of each request reduces how much has to be sent to the LLM.
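
The Listen, Brain, Speech flow with keyword pre-parsing can be sketched roughly as follows. This is an illustrative sketch only, not Linguflex's actual code: the `Module` and `Assistant` classes, their method names, and the placeholder weather handler are all hypothetical, and the speech-to-text, LLM, and TTS steps are replaced by simple string stand-ins.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Module:
    """Hypothetical extension module (e.g. weather, music)."""
    name: str
    keywords: List[str]
    handler: Callable[[str], str]


@dataclass
class Assistant:
    """Hypothetical pipeline: listen -> brain -> speak."""
    modules: List[Module] = field(default_factory=list)

    def listen(self, audio_text: str) -> str:
        # Stand-in for speech-to-text: the "audio" is already text here.
        return audio_text.strip()

    def select_modules(self, text: str) -> List[Module]:
        # Keyword pre-parsing: keep only modules whose keywords occur in
        # the request, so fewer tools need to be described to the LLM.
        lowered = text.lower()
        return [m for m in self.modules
                if any(k in lowered for k in m.keywords)]

    def brain(self, text: str) -> str:
        candidates = self.select_modules(text)
        if candidates:
            # A real system would let the LLM choose among candidates;
            # this sketch simply calls the first match.
            return candidates[0].handler(text)
        # Stand-in for a call to a local LLM or a hosted API.
        return f"(LLM would answer: {text})"

    def speak(self, reply: str) -> str:
        # Stand-in for text-to-speech synthesis.
        return f"[spoken] {reply}"

    def handle(self, audio_text: str) -> str:
        return self.speak(self.brain(self.listen(audio_text)))


weather = Module("weather", ["weather", "forecast"],
                 lambda text: "Sunny (placeholder)")
assistant = Assistant(modules=[weather])
print(assistant.handle("  What is the weather today?  "))
print(assistant.handle("Tell me a joke"))
```

Requests that match no keyword fall through to the language model without any module descriptions attached, which is one plausible way pre-parsing could keep prompts small.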

Quick Start & Requirements

Highlighted Details

  • Ollama support is now integrated.
  • Achieves ultra-low latency for language model communication and TTS generation.
  • Offers near-ElevenLabs quality local TTS synthesis via RVC and XTTS.
  • Features a modular design for extensibility, with planned modules for vision, memory, news, finance, and image generation.

Maintenance & Community

Linguflex is a personal passion project, and its author actively seeks community contributions and insights. Philip Ehrbright is credited with developing the Ollama support feature.

Licensing & Compatibility

The codebase is MIT licensed. However, the TTS engines it can use (CoquiEngine, ElevenlabsEngine, AzureEngine) carry their own restrictions: they are free only for non-commercial projects, and commercial use requires a paid plan or a specific tier. Use of the OpenAI engine is subject to OpenAI's terms.

Limitations & Caveats

The installation process is noted as challenging and potentially unstable due to complex integrations and Python's dependency management. Commercial use of the core TTS voice generation capabilities is restricted by the underlying model licenses, requiring separate paid plans for many components.

Health Check

Last commit: 1 month ago
Responsiveness: 1 day
Pull Requests (30d): 0
Issues (30d): 0

Star History

83 stars in the last 90 days

Explore Similar Projects

Starred by Thomas Wolf (Cofounder of Hugging Face), Chip Huyen (Author of AI Engineering, Designing Machine Learning Systems), and 2 more.

ultravox by fixie-ai

0.4%
4k
Multimodal LLM for real-time voice interactions
created 1 year ago
updated 4 days ago
Starred by Jeff Hammerbacher (Cofounder of Cloudera), Chip Huyen (Author of AI Engineering, Designing Machine Learning Systems), and 2 more.

MiniCPM-o by OpenBMB

0.2%
20k
MLLM for vision, speech, and multimodal live streaming on your phone
created 1 year ago
updated 1 month ago