rtc-aigc-embedded-demo  by volcengine

IoT demo for RTC AIGC integration on ESP32

created 6 months ago
296 stars

Top 90.6% on sourcepulse

Project Summary

This project provides a demo for integrating Real-Time Communication (RTC) with AI-generated content (AIGC) on embedded devices, targeting developers working with IoT and AI applications. It showcases a real-time conversational AI experience powered by Volcengine's cloud services and Espressif hardware.

How It Works

The demo orchestrates a pipeline involving Volcengine's RTC, Speech Recognition (ASR), Text-to-Speech (TTS), and Ark large language models. An embedded device (ESP32-S3-Korvo-2) captures audio, sends it for ASR, processes the transcribed text with a large language model for a response, synthesizes the response into speech via TTS, and plays it back. The server component manages API interactions and configurations.
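The ordering of the stages can be illustrated with a small sketch. This is not the demo's code: in the demo the stages are Volcengine services, while here they are stand-in local functions. All names (`run_turn`, `ConversationTurn`, `fake_asr`, `fake_llm`, `fake_tts`) are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ConversationTurn:
    """Intermediate results of one pass through the pipeline."""
    transcript: str
    reply_text: str
    reply_audio: bytes


def run_turn(
    captured_audio: bytes,
    asr: Callable[[bytes], str],
    llm: Callable[[str], str],
    tts: Callable[[str], bytes],
) -> ConversationTurn:
    """Chain the stages in the order described: ASR, then LLM, then TTS."""
    transcript = asr(captured_audio)
    reply_text = llm(transcript)
    reply_audio = tts(reply_text)
    return ConversationTurn(transcript, reply_text, reply_audio)


# Stand-in stages (hypothetical); the real demo calls cloud services instead.
def fake_asr(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_llm(text: str) -> str:
    return f"You said: {text}"


def fake_tts(text: str) -> bytes:
    return text.encode("utf-8")


turn = run_turn(b"hello", fake_asr, fake_llm, fake_tts)
print(turn.reply_text)
```

Passing the stages in as callables keeps the ordering logic separate from any particular service client, which makes the flow easy to exercise without network access.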

Quick Start & Requirements

  • Server:
    • Install dependencies: pip install requests
    • Configure RtcAigcConfig.py with Volcengine API keys (AK/SK), RTC AppID/AppKey, Ark EndpointId, TTS Voice Type, and ASR/TTS AppIDs/Access Tokens.
    • Run server: python3 RtcAigcService.py
  • Device (ESP32-S3):
    • Prerequisites: Linux server (Ubuntu 18.04+ recommended), Python 3.8+, Espressif ESP32-S3-Korvo-2 or AtomS3R board, CMake, Ninja, dfu-util.
    • Espressif ADF framework setup: Clone esp-adf, reset to specific commit 0d76650198ca96546c40d10a7ce8963bacdf820b, update submodules, run ./install.sh esp32s3, and . ./export.sh.
    • Clone demo into $ADF_PATH/examples.
    • Configure Config.h with server address and Volcengine parameters.
    • Apply patches for disabling Volcengine components and adding AtomS3R board support.
    • Compile: idf.py set-target esp32s3, idf.py menuconfig (set WiFi, board), idf.py build.
    • Flash & Monitor: idf.py flash, idf.py monitor.
  • Volcengine Services: Requires activation of RTC, Speech Recognition, Speech Synthesis, and Ark services.
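Because the server depends on several credentials, a pre-flight check before starting it can catch blank entries early. The sketch below is an assumption-laden illustration: the field names are hypothetical stand-ins for the settings listed above, and the real identifiers in RtcAigcConfig.py may differ.

```python
from typing import Dict, List, Optional

# Hypothetical field names mirroring the settings listed above;
# the real identifiers in RtcAigcConfig.py may differ.
REQUIRED_FIELDS = [
    "ak", "sk", "rtc_app_id", "rtc_app_key", "ark_endpoint_id",
    "tts_voice_type", "asr_app_id", "asr_access_token",
    "tts_app_id", "tts_access_token",
]


def missing_fields(config: Dict[str, Optional[str]]) -> List[str]:
    """Return required fields that are absent, None, or blank, in declaration order."""
    missing = []
    for name in REQUIRED_FIELDS:
        value = config.get(name)
        if value is None or not str(value).strip():
            missing.append(name)
    return missing
```

A caller could refuse to start the server whenever `missing_fields(...)` returns a non-empty list and print the names so the gaps are obvious.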

Highlighted Details

  • Demonstrates end-to-end AIGC integration on resource-constrained embedded hardware.
  • Utilizes Volcengine's suite of AI and RTC services for a conversational experience.
  • Supports ESP32-S3-Korvo-2 and AtomS3R development boards.

Maintenance & Community

  • The project welcomes technical discussions via issues and community groups.

Licensing & Compatibility

  • The repository itself appears to be under a permissive license, but the underlying Espressif ADF framework has its own licensing. Specific Volcengine service usage is governed by Volcengine's terms.

Limitations & Caveats

  • The provided server example is for demonstration and quick testing only; production environments require a custom server implementation.
  • Requires specific Volcengine service configurations and API keys.
  • Strict adherence to Espressif ADF and IDF versions is necessary for device-side compilation.
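Since the build depends on a pinned esp-adf commit, one way to guard against version drift is to compare the checkout's HEAD with the commit named in the setup steps. This is a minimal sketch, not part of the demo; the helper names `matches_pin` and `adf_head` are hypothetical, and it assumes `git` is available on the build host.

```python
import subprocess

# Commit named in the setup steps for the esp-adf checkout.
PINNED_ADF_COMMIT = "0d76650198ca96546c40d10a7ce8963bacdf820b"


def matches_pin(head: str, pinned: str = PINNED_ADF_COMMIT) -> bool:
    """Compare a `git rev-parse HEAD` result with the pinned commit."""
    return head.strip().lower() == pinned.lower()


def adf_head(adf_path: str) -> str:
    """Read the current commit of the checkout at adf_path (requires git)."""
    result = subprocess.run(
        ["git", "-C", adf_path, "rev-parse", "HEAD"],
        check=True, capture_output=True, text=True,
    )
    return result.stdout
```

For example, `matches_pin(adf_head(adf_path))` with `adf_path` set to the value of `$ADF_PATH` would return False if the checkout has moved off the pinned commit.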

Health Check

  • Last commit: 2 days ago
  • Responsiveness: Inactive
  • Pull requests (30d): 4
  • Issues (30d): 1
  • Star history: 65 stars in the last 90 days
