open-xiaoai-bridge by coderzc

Connect smart speakers to advanced AI services

Created 3 months ago

270 stars

Top 95.0% on SourcePulse

Project Summary

Summary

Open-XiaoAI Bridge provides a server application to integrate external AI services (OpenAI-compatible, OpenClaw, XiaoZhi AI) with Xiaomi's XiaoAI smart speakers, breaking their closed ecosystem. It offers enhanced functionality via a remote HTTP API, targeting users and developers seeking to customize speaker capabilities.

How It Works

A Python server communicates with a Rust client on the speaker via WebSocket. It processes audio streams using Voice Activity Detection (VAD) and Keyword Spotting (KWS) for efficiency. Audio is then routed to configured AI backends, supporting local ASR (SherpaASR) or XiaoAI's native ASR. The system features a modular design for enabling specific integrations.

Quick Start & Requirements

Prerequisites: Requires flashing XiaoAI speaker firmware (SSH enabled) and installing the Rust client program. Local ASR/TTS models may need to be downloaded.
Installation: Docker Compose is recommended: download config.py and docker-compose.yml, configure them, and run docker compose up -d. Local compilation involves cloning the repository, installing dependencies (uv, Rust), and running ./scripts/start.sh.
Configuration: Managed via config.py and environment variables to enable/disable services, set API endpoints, authentication tokens, and AI backend parameters.
Links: Demo ①, Demo ②, Quick Start, API Docs, FAQ.

Highlighted Details

OpenAI Compatibility: Seamless integration with services like OpenAI, Ollama, and LM Studio via the /v1/chat/completions endpoint.
OpenClaw Integration: Advanced features include custom wake words, multi-agent routing, continuous conversation, voice cloning (via Doubao TTS), and streaming playback.
Multi-Agent Routing: Enables distinct AI personalities activated by unique wake words on a single speaker.
Continuous Conversation: Supports multi-turn dialogues without repeated wake-ups, with interruption capability.
HTTP API: Provides remote control for text/audio playback and device management.
Modular Design: Features like XiaoZhi AI, OpenClaw, OpenAI compatibility, and HTTP API can be independently enabled.

Maintenance & Community

The project is maintained by coderzc. No specific community channels (Discord, Slack) or sponsorship details are provided in the README.

Licensing & Compatibility

License: MIT License.
Compatibility: The MIT license permits commercial use and integration into closed-source projects. However, adoption requires modifying the XiaoAI speaker's firmware.

Limitations & Caveats

Firmware Modification: Essential for setup, posing a significant barrier and potential warranty issue.
External Dependencies: Relies on external AI services or locally downloaded large model files for advanced features.
Setup Complexity: Initial setup involves firmware flashing, client installation, and configuration, demanding technical expertise.

Health Check

Last Commit

1 week ago

Responsiveness

Inactive

Pull Requests (30d)

2

Issues (30d)

3

Star History

41 stars in the last 30 days

Explore Similar Projects

pipecat-client-web by pipecat-ai

Real-time voice and multimodal AI web SDK

Created 2 years ago

Updated 1 week ago

lettabot by letta-ai

AI assistant with persistent, multi-channel memory

Created 5 months ago

Updated 1 month ago

com.openai.unity by RageAgainstThePixel

Unity SDK for OpenAI API access

Created 5 years ago

Updated 5 months ago

unity-AI-Chat-Toolkit by zhangliwei7758

Unity toolkit for AI chat functionality

Created 2 years ago

Updated 1 year ago

xiaozhi-esp32-server-golang by hackers365

High-performance AI backend for voice-driven IoT and edge devices

Created 1 year ago

Updated 1 week ago

JiwuChat by KiWi233333

Cross-platform chat app with AI bot integration

Created 2 years ago

Updated 1 week ago

openab by openabdev

Bridging chat platforms with AI coding assistants

Created 3 months ago

Updated 1 day ago

aoai-realtime-audio-sdk by Azure-Samples

Azure OpenAI SDK for real-time audio processing with GPT-4o

Created 1 year ago

Updated 9 months ago

xiaozhi-android-client by TOM88812

Cross-platform Flutter app for AI voice/text chat

Created 1 year ago

Updated 1 month ago

Starred by

Jason Huggins

Jason Huggins(Creator of Selenium),

Junyang Lin

Junyang Lin(Core Maintainer at Alibaba Qwen), and

3 more.

01 by openinterpreter

Open-source voice interface for desktop, mobile, and ESP32 chips

Created 2 years ago

Updated 1 year ago

wukong-robot by wzpan

Chinese voice assistant and smart speaker project

Created 7 years ago

Updated 1 year ago

Starred by

Chaoyu Yang

Chaoyu Yang(Founder of Bento),

Nir Gazit

Nir Gazit(Cofounder of Traceloop), and

4 more.

pipecat by pipecat-ai

Open-source framework for building real-time voice and multimodal conversational AI agents

Created 2 years ago

Updated 1 day ago

Feedback? Help us improve.