py-xiaozhi  by huangjunsen0406

Python voice client for AI assistant "Xiaozhi"

Created 7 months ago
2,501 stars

Top 18.6% on SourcePulse

Project Summary

This project provides a Python-based client for the "Xiaozhi" AI assistant, targeting users who want to experience its voice and multimodal capabilities without dedicated hardware. It offers AI-driven voice interaction, visual understanding, smart wake-up, and automatic conversation modes, alongside a suite of integrated tools and IoT device control.

How It Works

The core architecture is event-driven and uses Python's asyncio for high-concurrency, non-blocking operation. A layered design separates application logic, protocols, device management, and UI. Key features include advanced audio processing (Opus codec, WebRTC AEC, VAD, and Sherpa-ONNX offline wake word detection), dual protocol support (WebSocket and MQTT), and a modular MCP (Model Context Protocol) tool system for extensibility.
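As a rough illustration of the event-driven, asyncio-based pattern described above, the following sketch shows a tiny publish/subscribe bus where handlers from different layers react to the same event. The names EventBus, demo, and the "audio_frame" event are hypothetical and are not taken from the py-xiaozhi codebase.

```python
import asyncio
from collections import defaultdict
from typing import Any, Awaitable, Callable, DefaultDict, List

# A handler is any coroutine function that accepts one payload argument.
Handler = Callable[[Any], Awaitable[None]]


class EventBus:
    """Hypothetical event bus for illustration; not a py-xiaozhi class."""

    def __init__(self) -> None:
        self._handlers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, event: str, handler: Handler) -> None:
        self._handlers[event].append(handler)

    async def publish(self, event: str, payload: Any) -> None:
        # Run every handler for this event concurrently on the event loop.
        handlers = self._handlers.get(event, [])
        await asyncio.gather(*(handler(payload) for handler in handlers))


async def demo() -> List[str]:
    bus = EventBus()
    log: List[str] = []

    async def protocol_layer(frame: bytes) -> None:
        await asyncio.sleep(0)  # stand-in for non-blocking network I/O
        log.append(f"protocol layer got {len(frame)} bytes")

    async def ui_layer(frame: bytes) -> None:
        log.append("ui layer notified")

    bus.subscribe("audio_frame", protocol_layer)
    bus.subscribe("audio_frame", ui_layer)
    await bus.publish("audio_frame", b"\x00" * 320)
    return log


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Because the handlers run concurrently, the order of log entries depends on where each handler yields control, so callers should not rely on it.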

Quick Start & Requirements

  • Install: pip install -r requirements.txt
  • Run: python main.py (GUI mode, default) or python main.py --mode cli
  • Prerequisites: Python 3.9-3.12, Windows 10+, macOS 10.15+, or Linux. Microphone and speaker required. Internet connection for AI services. Optional: Sherpa-ONNX models for wake word, camera for vision.
  • Docs: Project Documentation
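
Before installing, it may help to confirm that the interpreter falls in the supported 3.9-3.12 range listed above. The following is a minimal sketch; the function name python_version_supported is hypothetical and this check is not part of the project itself.

```python
import sys
from typing import Tuple

# Supported range as stated in the prerequisites (inclusive on both ends).
MIN_VERSION = (3, 9)
MAX_VERSION = (3, 12)


def python_version_supported(version: Tuple[int, int]) -> bool:
    """Return True if (major, minor) lies within the supported range."""
    return MIN_VERSION <= version <= MAX_VERSION


if __name__ == "__main__":
    current = (sys.version_info[0], sys.version_info[1])
    if python_version_supported(current):
        print(f"Python {current[0]}.{current[1]} is within the supported range.")
    else:
        print(f"Python {current[0]}.{current[1]} is outside the supported 3.9-3.12 range.")
```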

Highlighted Details

  • AI voice interaction with natural language processing.
  • Visual multimodal capabilities for image recognition.
  • Integrated tools: system control, scheduling, music playback, 12306 ticket queries, recipes, maps, and more.
  • IoT device integration using a Thing model for smart home control.
  • Advanced audio processing including echo cancellation and offline wake word detection.
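
To make the "Thing model" idea concrete, here is a minimal sketch of how a device could be described by named properties and callable methods. The Thing class, make_lamp helper, and the TurnOn/TurnOff method names are assumptions for illustration only; the project's actual classes and descriptor format may differ.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict


@dataclass
class Thing:
    """Hypothetical Thing: a device described by properties and methods."""

    name: str
    properties: Dict[str, Any] = field(default_factory=dict)
    methods: Dict[str, Callable[..., None]] = field(default_factory=dict)

    def describe(self) -> Dict[str, Any]:
        # A descriptor an assistant could use to discover what the device offers.
        return {
            "name": self.name,
            "properties": dict(self.properties),
            "methods": sorted(self.methods),
        }

    def invoke(self, method: str, **params: Any) -> None:
        if method not in self.methods:
            raise KeyError(f"{self.name} has no method {method!r}")
        self.methods[method](**params)


def make_lamp() -> Thing:
    """Build an example lamp with one boolean property and two methods."""
    lamp = Thing(name="Lamp", properties={"power": False})

    def turn_on() -> None:
        lamp.properties["power"] = True

    def turn_off() -> None:
        lamp.properties["power"] = False

    lamp.methods["TurnOn"] = turn_on
    lamp.methods["TurnOff"] = turn_off
    return lamp


if __name__ == "__main__":
    lamp = make_lamp()
    lamp.invoke("TurnOn")
    print(lamp.describe())
```

Keeping state in properties and behavior in named methods lets a voice command map to a single invoke call without the caller knowing device internals.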

Maintenance & Community

The project welcomes contributions and follows PEP8 standards. It acknowledges several contributors and sponsors. Community support channels are not explicitly listed.

Licensing & Compatibility

Licensed under the MIT License, permitting commercial use and integration with closed-source projects.

Limitations & Caveats

The project is intended primarily for learning and for trying out Xiaozhi's features; it is not a production-ready replacement for the official hardware. Some advanced features, such as wake word detection, require downloading separate models. The README also notes that dependencies must be reinstalled manually after updates.

Health Check

  • Last commit: 5 days ago
  • Responsiveness: Inactive
  • Pull requests (30d): 5
  • Issues (30d): 23
  • Star history: 228 stars in the last 30 days
