xiaozhi-esp32-music by Maggotxy

ESP32 AI robot firmware for voice interaction and music

Created 8 months ago
259 stars

Top 97.8% on SourcePulse

Project Summary

This project provides open-source firmware for the Xiaozhi AI robot, enabling music playback with lyrics display on ESP32-based hardware. Targeting hobbyists and developers, it enhances the robot's functionality by integrating AI voice interaction with media control, offering a flexible platform for custom projects and commercial applications under the permissive MIT license.

How It Works

The firmware builds on the Xiaozhi AI Chat Robot framework, which uses large language models (LLMs) such as Qwen or DeepSeek for voice interaction and exposes device capabilities through the Model Context Protocol (MCP). It implements a self.music.play_song MCP tool for music playback, with OPUS audio codec support and lyrics display. The architecture follows a streaming ASR + LLM + TTS pipeline (speech recognition, language model, speech synthesis). On-device MCP handles hardware control (volume, GPIO), while cloud-side MCP extends to smart home or PC operations.
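
As a rough illustration of how an MCP tool such as self.music.play_song might be invoked, the sketch below builds a JSON-RPC 2.0 `tools/call` request, which is the message shape MCP defines for tool invocation. The argument name `song_name` and the helper `build_tool_call` are assumptions made for this example; the source does not document the tool's actual parameters.

```python
import json


def build_tool_call(request_id, tool_name, arguments):
    """Build an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# `song_name` is a hypothetical argument name, used only for illustration.
request = build_tool_call(1, "self.music.play_song", {"song_name": "Example Song"})

# Serialize for transport; ensure_ascii=False keeps non-ASCII song titles readable.
payload = json.dumps(request, ensure_ascii=False)
print(payload)
```

In practice the LLM backend, not hand-written code, would decide when to emit such a call; the sketch only shows the message structure.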

Quick Start & Requirements

  • Installation: Beginners can flash pre-built firmware, defaulting to the xiaozhi.me server for free Qwen model access. For development, use VSCode with ESP-IDF (v5.4+) on Linux (preferred) or Windows.
  • Prerequisites: ESP-IDF SDK (v5.4+) and familiarity with the project's C++ coding standards.
  • Hardware: Supports ESP32-C3, ESP32-S3, and ESP32-P4 chip platforms, with compatibility for over 70 specific development boards (e.g., M5Stack CoreS3, LILYGO T-Circle-S3).
  • Community: Join QQ Group: 826072986.

Highlighted Details

  • AI Voice Interaction: Integrates LLMs (Qwen/DeepSeek) with streaming ASR/TTS for natural conversation.
  • Music Playback: Dedicated self.music.play_song tool with lyrics display.
  • Extensive Hardware Support: Over 70 ESP32 boards (C3, S3, P4) are compatible.
  • Multi-language & Voice Features: Supports Chinese, English, Japanese, offline wake-up, and voiceprint recognition.
  • Connectivity & Control: Wi-Fi/4G, Websocket/MQTT+UDP, on-device and cloud MCP for diverse control scenarios.
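
The streaming ASR + LLM + TTS arrangement mentioned above can be pictured as a chain of lazy stages, where each stage starts work as soon as the previous one yields a piece of output. The following is a minimal sketch using Python generators; all three stage functions are stand-ins invented for this example and do not reflect the firmware's or server's actual code.

```python
from typing import Iterable, Iterator


def asr_stream(audio_chunks: Iterable[bytes]) -> Iterator[str]:
    """Stand-in ASR: emit one transcript piece per incoming audio chunk."""
    for chunk in audio_chunks:
        yield chunk.decode("utf-8")


def llm_stream(transcripts: Iterable[str]) -> Iterator[str]:
    """Stand-in LLM: produce a reply for each transcript piece as it arrives."""
    for text in transcripts:
        yield f"reply to '{text}'"


def tts_stream(replies: Iterable[str]) -> Iterator[bytes]:
    """Stand-in TTS: turn each reply into an 'audio' frame (here, just bytes)."""
    for reply in replies:
        yield reply.encode("utf-8")


def run_pipeline(audio_chunks: Iterable[bytes]) -> Iterator[bytes]:
    """Chain the three stages; nothing runs until the result is iterated."""
    return tts_stream(llm_stream(asr_stream(audio_chunks)))


consumed = []


def microphone() -> Iterator[bytes]:
    """Hypothetical audio source that records which chunks were read."""
    for chunk in [b"play", b"some music"]:
        consumed.append(chunk)
        yield chunk


pipeline = run_pipeline(microphone())
first_frame = next(pipeline)          # only the first chunk has been read so far
consumed_after_first = list(consumed)
remaining_frames = list(pipeline)     # drain the rest of the stream
```

The point of the generator chain is latency: the first output frame is available after a single input chunk has been read, rather than after the whole utterance.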

Maintenance & Community

  • Contributors: Key contributors include blankbubblegumcandy, Silicon Spirit Creation Technology, and Xiao ShuangshuangMeow.
  • Community: Active engagement via QQ Group: 826072986.
  • Ecosystem: Several third-party open-source projects provide compatible servers and clients.

Licensing & Compatibility

  • License: MIT License.
  • Compatibility: The permissive license allows free use, including in commercial applications.

Limitations & Caveats

  • Hardware Support Contradiction: While ESP32-C3 is listed as a supported chip platform, the README explicitly states that development boards with the ESP32C3 chip are temporarily unsupported.
  • Configuration: Music playback works correctly only after the MCP tool is enabled in the Xiaozhi backend configuration (accessible via xiaozhi.me).
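
One way to reason about the configuration caveat is to check whether the music tool appears among the tools a device or backend advertises. MCP defines a `tools/list` result containing a `tools` array of objects with a `name` field; the sketch below checks such a result for self.music.play_song. The sample result, the second tool name, and the helper `tool_is_advertised` are hypothetical, and whether the Xiaozhi backend exposes this list to users in this form is an assumption.

```python
def tool_is_advertised(tools_list_result, tool_name):
    """Return True if `tool_name` appears in an MCP `tools/list` result."""
    tools = tools_list_result.get("tools", [])
    return any(tool.get("name") == tool_name for tool in tools)


# Hypothetical `tools/list` result, for illustration only.
sample_result = {
    "tools": [
        {"name": "self.example.other_tool", "description": "Placeholder tool"},
        {"name": "self.music.play_song", "description": "Play a song"},
    ]
}

music_enabled = tool_is_advertised(sample_result, "self.music.play_song")
missing_tool = tool_is_advertised({"tools": []}, "self.music.play_song")
```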

Health Check

  • Last Commit: 2 months ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 10 stars in the last 30 days
