xiaozhi-esp32-music by Maggotxy

ESP32 AI robot firmware for voice interaction and music

Created 8 months ago

259 stars

Top 97.8% on SourcePulse

Project Summary

This project provides open-source firmware for the Xiaozhi AI robot, enabling music playback with lyrics display on ESP32-based hardware. Targeting hobbyists and developers, it enhances the robot's functionality by integrating AI voice interaction with media control, offering a flexible platform for custom projects and commercial applications under the permissive MIT license.

How It Works

The firmware builds on the Xiaozhi AI Chat Robot framework, which uses large language models (LLMs) such as Qwen or DeepSeek for voice interaction and exposes device capabilities through MCP (Model Context Protocol). It adds a self.music.play_song tool for music playback, with support for the Opus audio codec and lyrics display. The architecture follows a streaming ASR + LLM + TTS pipeline, with on-device hardware control (volume, GPIO) and cloud-based extensions for smart home or PC operations.
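To make the tool mechanism concrete, here is a minimal sketch of what a request for the self.music.play_song tool might look like on the wire. It assumes the standard MCP convention of a JSON-RPC 2.0 `tools/call` request; the helper `build_tool_call` and the `song_name` argument are hypothetical, chosen for illustration, and the firmware's actual argument schema may differ.

```python
import json
from itertools import count

# Monotonic request ids, as JSON-RPC expects each request to carry one.
_request_ids = count(1)


def build_tool_call(tool_name: str, arguments: dict) -> str:
    """Serialize an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    request = {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    return json.dumps(request, ensure_ascii=False)


# "song_name" is a hypothetical argument name, not taken from the firmware.
message = build_tool_call("self.music.play_song", {"song_name": "Yesterday"})
print(message)
```

In this picture the LLM decides when to call the tool, and the device only has to parse the request and start playback.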

Quick Start & Requirements

Installation: Beginners can flash pre-built firmware, defaulting to the xiaozhi.me server for free Qwen model access. For development, use VSCode with ESP-IDF (v5.4+) on Linux (preferred) or Windows.
Prerequisites: ESP-IDF SDK (v5.4+), C++ coding standards.
Hardware: Supports ESP32-C3, ESP32-S3, and ESP32-P4 chip platforms, with compatibility for over 70 specific development boards (e.g., M5Stack CoreS3, LILYGO T-Circle-S3).
Community: Join QQ Group: 826072986.
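Since development requires ESP-IDF v5.4 or newer, a small pre-build check can catch an outdated toolchain early. The sketch below only parses a version string; it assumes the string looks like "ESP-IDF v5.4.1" (the general shape reported by `idf.py --version`), and the helper names are hypothetical.

```python
import re
from typing import Tuple

# Minimum ESP-IDF version stated in the requirements above.
MIN_IDF_VERSION = (5, 4)


def parse_idf_version(text: str) -> Tuple[int, int]:
    """Extract (major, minor) from a string such as 'ESP-IDF v5.4.1'."""
    match = re.search(r"v(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no version found in {text!r}")
    return int(match.group(1)), int(match.group(2))


def meets_minimum(text: str, minimum: Tuple[int, int] = MIN_IDF_VERSION) -> bool:
    """Return True if the reported version is at least `minimum`."""
    return parse_idf_version(text) >= minimum


# Sample strings are illustrative, not captured from a real toolchain.
for sample in ("ESP-IDF v5.4.1", "ESP-IDF v5.3.2"):
    print(sample, "->", "ok" if meets_minimum(sample) else "too old")
```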

Highlighted Details

AI Voice Interaction: Integrates LLMs (Qwen/DeepSeek) with streaming ASR/TTS for natural conversation.
Music Playback: Dedicated self.music.play_song tool with lyrics display.
Extensive Hardware Support: Over 70 ESP32 boards (C3, S3, P4) are compatible.
Multi-language & Voice Features: Supports Chinese, English, Japanese, offline wake-up, and voiceprint recognition.
Connectivity & Control: Wi-Fi/4G, Websocket/MQTT+UDP, on-device and cloud MCP for diverse control scenarios.
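The streaming ASR + LLM + TTS design mentioned above can be pictured as a chain of lazy stages, where each stage starts emitting output before the previous one has finished. The following is a toy sketch under that assumption: all three stage functions are hypothetical stand-ins that only move strings around, not real speech or language models.

```python
from typing import Iterable, Iterator


def asr_stream(audio_frames: Iterable[bytes]) -> Iterator[str]:
    """Stand-in ASR: pretend each audio frame decodes to one word."""
    for frame in audio_frames:
        yield frame.decode("utf-8")


def llm_stream(words: Iterable[str]) -> Iterator[str]:
    """Stand-in LLM: emit reply tokens one at a time, echoing the input."""
    yield "You"
    yield "said:"
    for word in words:
        yield word


def tts_stream(tokens: Iterable[str]) -> Iterator[bytes]:
    """Stand-in TTS: turn each token into a fake audio chunk."""
    for token in tokens:
        yield f"<audio:{token}>".encode("utf-8")


def run_pipeline(audio_frames: Iterable[bytes]) -> Iterator[bytes]:
    """Chain the stages; nothing runs until the caller pulls a chunk."""
    return tts_stream(llm_stream(asr_stream(audio_frames)))


for chunk in run_pipeline([b"play", b"music"]):
    print(chunk)
```

Because every stage is a generator, the first audio chunk is available as soon as the first token exists, which is the property that keeps perceived latency low in a streaming voice pipeline.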

Maintenance & Community

Contributors: Key contributors include blankbubblegumcandy, Silicon Spirit Creation Technology, and Xiao ShuangshuangMeow.
Community: Active engagement via QQ Group: 826072986.
Ecosystem: Several third-party open-source projects provide compatible servers and clients.

Licensing & Compatibility

License: MIT License.
Compatibility: The MIT License permits free use, including in commercial applications.

Limitations & Caveats

Hardware Support Contradiction: While ESP32-C3 is listed as a supported chip platform, the README explicitly states that development boards with the ESP32C3 chip are temporarily unsupported.
Configuration: Music playback works only after the MCP tool is enabled in the Xiaozhi backend (accessible via xiaozhi.me).

Health Check

Last Commit: 2 months ago
Responsiveness: Inactive
Pull Requests (30d): 0
Issues (30d): 0
Star History: 10 stars in the last 30 days

Explore Similar Projects

LingEcho-App by code-100-precent

An intelligent voice interaction platform for AI

Created 2 months ago

Updated 4 days ago

Starred by Elvis Saravia (Founder of DAIR.AI).

S.A.T.U.R.D.A.Y by GRVYDEV

Vocal computing toolbox for building voice interfaces to LLMs

Created 2 years ago

Updated 2 years ago

alibabacloud-bailian-speech-demo by aliyun

Speech AI SDK demos for AlibabaCloud Bailian

Created 1 year ago

Updated 2 months ago

echokit_server by second-state

Open-source voice agent platform

Created 1 year ago

Updated 1 week ago

whisplay-ai-chatbot by PiSugar

Pocket AI assistant like a futuristic walkie-talkie

Created 9 months ago

Updated 2 days ago

onju-voice by justLV

Hackable AI home assistant platform using Google Nest Mini form factor

Created 2 years ago

Updated 1 year ago

esp-ai by wangzongming

ESP-AI: AI integration solution for hardware

Created 1 year ago

Updated 1 month ago

Esp32_VoiceChat_LLMs by MetaWu2077

ESP32 device for voice chat with LLMs

Created 1 year ago

Updated 1 year ago

py-xiaozhi by huangjunsen0406

Python voice client for AI assistant "Xiaozhi"

Created 1 year ago

Updated 1 month ago

mi-gpt by idootop

Voice assistant for integrating smart speakers with LLMs

Created 2 years ago

Updated 5 months ago

Starred by Chaoyu Yang (Founder of Bento), Nir Gazit (Cofounder of Traceloop), and 4 more.

pipecat by pipecat-ai

Open-source framework for building real-time voice and multimodal conversational AI agents

Created 2 years ago

Updated 17 hours ago

xiaozhi-esp32 by 78

ESP32 chatbot for AI hardware development

Created 1 year ago

Updated 6 days ago
