hackers365
High-performance AI backend for voice-driven IoT and edge devices
Top 97.0% on SourcePulse
This project provides a high-performance, fully streaming AI backend service written in Go, designed for IoT and smart voice applications. It integrates Automatic Speech Recognition (ASR), Large Language Models (LLM), and Text-to-Speech (TTS), enabling low-latency, real-time AI voice interaction for smart terminals and edge devices. The service is built for high concurrency and supports multiple transport protocols (WebSocket, MQTT, UDP), offering a flexible and scalable solution for developers.
How It Works
The core architecture features an end-to-end, full-streaming AI voice pipeline (ASR → LLM → TTS) for minimal latency. It employs a modular, pluggable design, abstracting transport layers (WebSocket, MQTT, UDP) and utilizing message queues for asynchronous LLM and TTS processing. The system leverages resource pooling and connection reuse for high throughput. It integrates diverse AI engines like FunASR, OpenAI-compatible models, Ollama, EdgeTTS, and CosyVoice through the Eino framework, allowing for flexible AI capability injection.
Quick Start & Requirements
The recommended installation is the one-click startup package from the releases page, which includes the main program, console, and voiceprint service. Alternatively, the service can be deployed with Docker Compose or plain Docker. Local compilation requires Go 1.20+, the Opus codec libraries (libopus0, libopusfile-dev), and ONNX Runtime (v1.21.0). Once the service is running, the web console is available at http://<server_ip_or_domain>:8080.
https://github.com/hackers365/xiaozhi-esp32-server-golang/releases
doc/quickstart_bundle_tutorial.md
Highlighted Details
Maintenance & Community
The project is primarily maintained by "hackers365". Community interaction takes place through a WeChat group (the QR code has expired, so direct contact is recommended) and the author's personal WeChat. The roadmap indicates plans for establishing persistent (long-lived) connections with devices and implementing proactive AI features.
Licensing & Compatibility
The project is released under the MIT License, which is permissive for commercial use and integration into closed-source projects.
Limitations & Caveats
A security and permission system is currently in the planning phase. Access to community support may require direct contact with the author due to expired links. Local compilation has specific dependency requirements for Go and ONNX Runtime.