CoreWorxLab: Local voice assistant with extensible tool capabilities
Top 82.0% on SourcePulse
Summary
CAAL is a local, extensible voice assistant designed for users seeking privacy and customization. It integrates seamlessly with Home Assistant and allows for dynamic capability expansion through auto-discovered n8n workflows, acting as a flexible platform for smart home control and automation.
How It Works
Built on LiveKit Agents, CAAL processes voice commands through a pipeline involving wake word detection (OpenWakeWord), speech-to-text (Speaches, Ollama, or Groq), large language model inference (Ollama or Groq), and text-to-speech (Kokoro or Piper). Its core innovation lies in exposing any n8n workflow as a tool via MCP (Model Context Protocol), enabling users to easily add custom functionality. Home Assistant integration is achieved through simplified MCP tools for device control and state querying.
Quick Start & Requirements
Installation involves cloning the repository, copying .env.example to .env, and configuring CAAL_HOST_IP. Deployment is primarily Docker-based:
GPU: docker compose up -d. Requires Docker with the NVIDIA Container Toolkit; 12GB+ VRAM recommended.
CPU-only: docker compose -f docker-compose.cpu.yaml up -d. Uses Groq for LLM/STT and Piper for TTS, requiring no GPU.
Apple Silicon: ./start-apple.sh leverages mlx-audio.
Distributed setups are covered in docs/DISTRIBUTED-DEPLOYMENT.md.
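The install steps above can be condensed into a short session. This is a sketch, not the project's official instructions: the repository URL is assumed from the org and project names, and the IP address is a placeholder.

```shell
# Clone the repository (URL assumed from the CoreWorxLab/CAAL names)
git clone https://github.com/CoreWorxLab/CAAL.git
cd CAAL

# Create the environment file, then edit it to set CAAL_HOST_IP
# to your server's LAN IP (e.g. 192.168.1.50)
cp .env.example .env

# GPU deployment (Docker + NVIDIA Container Toolkit, 12GB+ VRAM recommended)
docker compose up -d

# Or CPU-only deployment (Groq for LLM/STT, Piper for TTS):
# docker compose -f docker-compose.cpu.yaml up -d
```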
Post-setup, access the configuration wizard at http://YOUR_SERVER_IP:3000.
Highlighted Details
Home Assistant integration via hass_control and hass_get_state tools for seamless smart home management.
HTTP endpoints (/announce, /wake, /reload-tools) on port 8889 for external integrations.
Maintenance & Community
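As a sketch of how an external system might call those endpoints: the routes and port come from the README, but the request method and JSON payload shape are assumptions.

```shell
# Trigger the wake state remotely (route and port from the README)
curl -X POST http://YOUR_SERVER_IP:8889/wake

# Ask the assistant to speak a message; the JSON body is an assumed shape
curl -X POST http://YOUR_SERVER_IP:8889/announce \
  -H "Content-Type: application/json" \
  -d '{"message": "Dinner is ready"}'

# Reload auto-discovered n8n tools after adding or editing a workflow
curl -X POST http://YOUR_SERVER_IP:8889/reload-tools
```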
The README does not detail specific contributors, sponsorships, or community channels like Discord/Slack. It lists several related projects, including LiveKit Agents, Speaches, Kokoro-FastAPI, Piper, mlx-audio, Ollama, Groq, n8n, and Home Assistant, suggesting an active ecosystem.
Licensing & Compatibility
The project is released under the permissive MIT License, allowing for commercial use and integration into closed-source applications. Note that browsers require HTTPS for voice (microphone) access from any device other than localhost.
Limitations & Caveats
GPU mode requires an NVIDIA GPU with the NVIDIA Container Toolkit, ideally with 12GB+ VRAM. The CPU-only mode relies on Groq's cloud API for LLM inference and speech-to-text, so it is not fully local. Initial setup may be slow due to model downloads. Ollama requires specific network binding configuration when accessed from within Docker.
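For the Ollama caveat, the usual fix is to bind a host-installed Ollama to all interfaces so Docker containers can reach it across the bridge network. A minimal sketch for a systemd-managed install, using Ollama's documented OLLAMA_HOST variable:

```shell
# Make a host-installed Ollama reachable from Docker containers
sudo systemctl edit ollama.service
# In the override that opens, add:
#   [Service]
#   Environment="OLLAMA_HOST=0.0.0.0"
sudo systemctl restart ollama

# From inside a container, point at the host rather than localhost,
# e.g. http://host.docker.internal:11434 (on Linux this hostname needs
# an extra_hosts: "host.docker.internal:host-gateway" entry in compose)
```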