xiaogpt  by yihong0618

Connect LLMs to Xiaomi AI Speakers

created 2 years ago
6,620 stars

Top 7.9% on sourcepulse

Project Summary

This project enables users to interact with various Large Language Models (LLMs) like ChatGPT, Gemini, and Llama3 through Xiaomi AI speakers. It targets users who own Xiaomi AI speakers and want to leverage advanced AI capabilities without needing to purchase new hardware. The primary benefit is extending the functionality of existing smart speakers with cutting-edge LLM technology.

How It Works

The project acts as a bridge, intercepting commands sent to the Xiaomi AI speaker and routing them to the chosen LLM. It then converts the LLM's response back into a format the speaker can understand and deliver via Text-to-Speech (TTS). The core mechanism involves using the miservice_fork library to communicate with the speaker and API keys or cookies to authenticate with the LLMs. This approach avoids requiring root access on the speaker and allows for flexible LLM integration.
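
The bridge flow described above can be sketched as a small loop. This is a minimal illustration under stated assumptions, not the project's implementation: FakeSpeaker, bridge_once, and echo_llm are hypothetical names, and the speaker class is a stand-in rather than the miservice_fork API.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class FakeSpeaker:
    """Hypothetical stand-in for the speaker connection (not the miservice_fork API)."""
    pending: List[str] = field(default_factory=list)
    spoken: List[str] = field(default_factory=list)

    def poll_question(self) -> Optional[str]:
        # Return the next captured utterance, or None if there is nothing new.
        return self.pending.pop(0) if self.pending else None

    def say(self, text: str) -> None:
        # A real bridge would hand this text to the speaker's TTS.
        self.spoken.append(text)


def bridge_once(speaker: FakeSpeaker, ask_llm: Callable[[str], str]) -> bool:
    """Handle at most one question; return True if one was answered."""
    question = speaker.poll_question()
    if question is None:
        return False
    answer = ask_llm(question)   # route the question to the chosen LLM
    speaker.say(answer)          # deliver the reply through the speaker
    return True


def echo_llm(prompt: str) -> str:
    # Placeholder for a real LLM call authenticated with an API key or cookie.
    return f"You asked: {prompt}"


speaker = FakeSpeaker(pending=["what is the weather"])
while bridge_once(speaker, echo_llm):
    pass
print(speaker.spoken)
```

Passing the LLM as a plain callable keeps the loop independent of any one provider, which mirrors the flexible LLM integration the summary describes.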

Quick Start & Requirements

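  • Install: pip install miservice_fork (this installs the helper library used to talk to the speaker; see the README for installing xiaogpt itself)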
  • Prerequisites: Python 3.8+, Xiaomi AI speaker with network access, LLM API keys (e.g., OpenAI, Gemini), Xiaomi account credentials or cookie.
  • Setup: Requires obtaining the speaker's DID and setting environment variables for user, password, and DID. Detailed setup instructions and troubleshooting tips are available in the README.
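
As a rough sketch of the setup step, the snippet below reads the user, password, and DID from environment variables and fails early if one is missing. The variable names MI_USER, MI_PASS, and MI_DID are assumptions here, and load_speaker_settings is a hypothetical helper; check the README for the names the project actually expects.

```python
import os
from typing import Dict, Mapping

# Assumed variable names; confirm them against the README before relying on them.
REQUIRED_VARS = ("MI_USER", "MI_PASS", "MI_DID")


def load_speaker_settings(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Collect the account and device settings, failing early if any is missing."""
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_VARS}


if __name__ == "__main__":
    try:
        settings = load_speaker_settings()
        print("Speaker DID:", settings["MI_DID"])
    except RuntimeError as err:
        print(err)
```

Checking all three values up front gives one clear error message instead of a login failure partway through startup.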

Highlighted Details

  • Supports multiple LLMs including ChatGPT, New Bing, ChatGLM, Gemini, Doubao, Moonshot, 01, Llama3, and Tongyi Qianwen.
  • Offers various TTS options: default, Edge, OpenAI, Azure, Volc, Baidu, Google, and Fish.
  • Integrates with LangChain for enhanced capabilities like web search.
  • Supports configuration via config.yaml or command-line arguments, with CLI arguments taking precedence.
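
The precedence rule in the last bullet (command-line arguments override the config file) can be illustrated with a short merge. This is a sketch, not the project's code: the --bot option and the DEFAULTS values are hypothetical, and a plain dict stands in for the parsed config.yaml so the example needs no YAML library.

```python
import argparse
from typing import Any, Dict, List, Optional

# Hypothetical defaults used only for this illustration.
DEFAULTS: Dict[str, Any] = {"bot": "chatgpt", "tts": "default", "use_command": False}


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser()
    # default=None lets us tell "not given on the CLI" apart from a real value.
    parser.add_argument("--bot", default=None)
    parser.add_argument("--tts", default=None)
    parser.add_argument("--use_command", action="store_true", default=None)
    return parser


def resolve_config(file_values: Dict[str, Any],
                   argv: Optional[List[str]] = None) -> Dict[str, Any]:
    """Merge defaults, then file values, then CLI values (highest priority)."""
    cli = vars(build_parser().parse_args(argv))
    merged = dict(DEFAULTS)
    merged.update(file_values)
    merged.update({key: value for key, value in cli.items() if value is not None})
    return merged


file_values = {"bot": "gemini", "tts": "edge"}  # as if parsed from the config file
print(resolve_config(file_values, ["--tts", "openai"]))
# prints {'bot': 'gemini', 'tts': 'openai', 'use_command': False}
```

Only the option given on the command line (tts) replaces the file value; bot still comes from the file and use_command falls back to the default.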

Maintenance & Community

The project is actively maintained, with contributions from various individuals. Users can find support and engage with the community via GitHub Issues.

Licensing & Compatibility

The README excerpt summarized here does not state a license. Commercial use or closed-source linking would require confirming the license in the repository.

Limitations & Caveats

Some Xiaomi speaker models (e.g., LX04, X10A, L05B, L05C) may require the --use_command flag and only support the speaker's native TTS. WSL users may need to configure proxy settings. Xiaomi's account security measures (risk control) can sometimes lead to login failures, requiring the use of cookies or specific token handling.

Health Check

  • Last commit: 3 weeks ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 1
  • Issues (30d): 0
  • Star History: 96 stars in the last 90 days