onju-voice  by justLV

Hackable AI home assistant platform in the Google Nest Mini form factor

created 2 years ago
1,514 stars

Top 27.8% on sourcepulse

Project Summary

This project provides a hackable AI home assistant platform, designed to replicate the form factor of a Google Nest Mini using a custom ESP32-S3 based PCB. It targets makers and developers interested in building custom voice assistants with local LLM capabilities, offering a flexible alternative to commercial smart speakers.

How It Works

The platform consists of a custom PCB with an ESP32-S3 microcontroller and a companion server. The ESP32-S3 handles audio capture and basic processing, while the server manages transcription (via local Whisper), LLM-based response generation (e.g., OpenAI), and Text-to-Speech (e.g., ElevenLabs). Audio data is streamed between the device and server using UDP and TCP.
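
The server-side flow described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the project's actual server code: the names `pcm16_to_floats` and `handle_utterance` are hypothetical, the three stage callables are stand-ins for Whisper, an LLM, and a TTS service, and the 16-bit little-endian PCM sample format is assumed for the sake of the example.

```python
import struct
from typing import Callable, List


def pcm16_to_floats(chunks: List[bytes]) -> List[float]:
    """Join raw audio packets and convert them to floats in [-1.0, 1.0).

    Assumes 16-bit signed little-endian PCM (an assumption for this sketch).
    """
    raw = b"".join(chunks)
    raw = raw[: len(raw) - (len(raw) % 2)]  # drop a trailing odd byte, if any
    count = len(raw) // 2
    samples = struct.unpack(f"<{count}h", raw)
    return [s / 32768.0 for s in samples]


def handle_utterance(
    chunks: List[bytes],
    transcribe: Callable[[List[float]], str],
    generate: Callable[[str], str],
    synthesize: Callable[[str], bytes],
) -> bytes:
    """Run the three stages strictly one after another (no streaming)."""
    audio = pcm16_to_floats(chunks)   # audio received from the device
    text = transcribe(audio)          # speech-to-text stage
    reply = generate(text)            # LLM response stage
    return synthesize(reply)          # text-to-speech stage, bytes sent back
```

Passing the stages in as callables keeps the sketch independent of any particular provider; each one can be swapped for a real client without changing the control flow.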

Quick Start & Requirements

  • Server: Run pip install -r requirements.txt in the server directory. Requires Python, Whisper, an OpenAI API key, and an ElevenLabs API key. Configure via config.yaml.
  • Firmware: Arduino IDE with ESP32 board support. Requires the Adafruit NeoPixel library. Set WiFi credentials in credentials.h.
  • Hardware: Custom PCB (design files provided) or a breadboard setup with ESP32-S3 devboard, microphone, amplifier, speaker, and LED strip.
  • Home Assistant: Docker Compose instructions provided.
  • Maubot: Requires separate Maubot setup.
  • Demo: https://github.com/justLV/onju-voice/blob/main/docs/demo.md

Highlighted Details

  • Drop-in replacement PCB for Google Nest Mini (2nd gen).
  • Local Whisper for transcription and OpenAI/local LLMs for response generation.
  • Integrations with Home Assistant and Maubot (for messaging).
  • ESP32-S3 firmware programmable via Arduino IDE.
  • Server code runs on macOS, Linux, or Windows.

Maintenance & Community

The project is explicitly described as "not being actively maintained," but all source code and design files have been released so that others can continue development.

Licensing & Compatibility

The README does not explicitly state a license. Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

The project is experimental and not a full replacement for commercial assistants. It lacks advanced features like Voice Activity Detection (VAD), Acoustic Echo Cancellation (AEC), and Blind Source Separation (BSS) on the device, as these are not fully supported by the Arduino IDE for ESP32. Conversation flow is serialized, and streaming responses are not implemented.
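
For readers unfamiliar with VAD, a very simple energy-threshold version can be sketched as below. It is illustrative only: the project does not include this code, production VADs are considerably more robust, and the function names, the frame length of 320 samples, and the threshold of 0.02 are arbitrary assumptions chosen for the example.

```python
import math
from typing import List, Sequence


def frame_is_speech(frame: Sequence[float], threshold: float = 0.02) -> bool:
    """Return True when the frame's RMS energy reaches the threshold."""
    if not frame:
        return False
    rms = math.sqrt(sum(s * s for s in frame) / len(frame))
    return rms >= threshold


def speech_flags(
    samples: Sequence[float], frame_len: int = 320, threshold: float = 0.02
) -> List[bool]:
    """Split samples into fixed-size frames and flag each one as speech or not."""
    return [
        frame_is_speech(samples[i : i + frame_len], threshold)
        for i in range(0, len(samples), frame_len)
    ]
```

A fixed threshold like this mislabels loud background noise as speech, which is one reason dedicated VAD models are usually preferred.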

Health Check

  • Last commit: 1 year ago
  • Responsiveness: 1+ week
  • Pull requests (30d): 0
  • Issues (30d): 0

Star History

53 stars in the last 90 days
