rtc-aigc-embedded-demo by volcengine

IoT demo for RTC AIGC integration on ESP32

Created 8 months ago
303 stars

Top 88.2% on SourcePulse

Project Summary

This project provides a demo for integrating Real-Time Communication (RTC) with AI-generated content (AIGC) on embedded devices, targeting developers working with IoT and AI applications. It showcases a real-time conversational AI experience powered by Volcengine's cloud services and Espressif hardware.

How It Works

The demo orchestrates a pipeline involving Volcengine's RTC, Speech Recognition (ASR), Text-to-Speech (TTS), and Ark large language models. An embedded device (ESP32-S3-Korvo-2) captures audio, sends it for ASR, processes the transcribed text with a large language model for a response, synthesizes the response into speech via TTS, and plays it back. The server component manages API interactions and configurations.
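
The turn-by-turn flow described above can be sketched as a chain of three stages. This is a minimal illustration only, not the demo's actual code: the class `VoicePipeline` and the stub functions `fake_asr`, `fake_llm`, and `fake_tts` are hypothetical names that stand in for the cloud ASR, large language model, and TTS services so that the sketch runs offline.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class VoicePipeline:
    """One conversational turn: captured audio -> ASR -> LLM -> TTS -> playback audio."""

    recognize: Callable[[bytes], str]    # ASR stage (hypothetical stand-in)
    generate: Callable[[str], str]       # LLM stage (hypothetical stand-in)
    synthesize: Callable[[str], bytes]   # TTS stage (hypothetical stand-in)

    def run_turn(self, captured_audio: bytes) -> bytes:
        transcript = self.recognize(captured_audio)   # speech -> text
        reply_text = self.generate(transcript)        # text -> response text
        return self.synthesize(reply_text)            # response text -> speech


# Stub stages: they only move bytes and strings around, no cloud calls.
def fake_asr(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_llm(text: str) -> str:
    return f"You said: {text}"


def fake_tts(text: str) -> bytes:
    return text.encode("utf-8")


pipeline = VoicePipeline(fake_asr, fake_llm, fake_tts)
reply_audio = pipeline.run_turn(b"hello")
print(reply_audio)
```

Keeping each stage behind a plain callable makes it easy to swap a stub for a real service client one stage at a time while testing.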

Quick Start & Requirements

  • Server:
    • Install dependencies: pip install requests
    • Configure RtcAigcConfig.py with Volcengine API keys (AK/SK), RTC AppID/AppKey, Ark EndpointId, TTS Voice Type, and ASR/TTS AppIDs/Access Tokens.
    • Run server: python3 RtcAigcService.py
  • Device (ESP32-S3):
    • Prerequisites: Linux server (Ubuntu 18.04+ recommended), Python 3.8+, Espressif ESP32-S3-Korvo-2 or AtomS3R board, CMake, Ninja, dfu-util.
    • Espressif ADF framework setup: Clone esp-adf, reset to specific commit 0d76650198ca96546c40d10a7ce8963bacdf820b, update submodules, run ./install.sh esp32s3, and . ./export.sh.
    • Clone demo into $ADF_PATH/examples.
    • Configure Config.h with server address and Volcengine parameters.
    • Apply patches for disabling Volcengine components and adding AtomS3R board support.
    • Compile: idf.py set-target esp32s3, idf.py menuconfig (set WiFi, board), idf.py build.
    • Flash & Monitor: idf.py flash, idf.py monitor.
  • Volcengine Services: Requires activation of RTC, Speech Recognition, Speech Synthesis, and Ark services.
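
The device-side steps above can be collected into a dry-run script that only prints the shell commands in order. This is a sketch under stated assumptions: the helper `device_build_commands` is a hypothetical name, the exact `git reset --hard` and `git submodule update --init --recursive` invocations are one plausible reading of "reset to specific commit" and "update submodules", and the manual steps (cloning the demo, editing Config.h, applying patches) are left as comments because the source does not give their exact commands.

```python
from typing import List

# Commit hash taken from the setup steps above.
ADF_COMMIT = "0d76650198ca96546c40d10a7ce8963bacdf820b"
ADF_REPO = "https://github.com/espressif/esp-adf.git"


def device_build_commands(target: str = "esp32s3") -> List[str]:
    """Return the device build steps as shell command strings (nothing is executed)."""
    return [
        f"git clone {ADF_REPO}",
        "cd esp-adf",
        f"git reset --hard {ADF_COMMIT}",            # assumed form of "reset to commit"
        "git submodule update --init --recursive",   # assumed form of "update submodules"
        f"./install.sh {target}",
        ". ./export.sh",
        "# manual: clone the demo into $ADF_PATH/examples",
        "# manual: edit Config.h (server address, Volcengine parameters)",
        "# manual: apply the patches described in the demo",
        f"idf.py set-target {target}",
        "idf.py menuconfig",                         # set WiFi and board here
        "idf.py build",
        "idf.py flash",
        "idf.py monitor",
    ]


commands = device_build_commands()
print("\n".join(commands))
```

Printing rather than executing keeps the sketch safe to run anywhere; the output can be reviewed and pasted into a shell step by step.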

Highlighted Details

  • Demonstrates end-to-end AIGC integration on resource-constrained embedded hardware.
  • Utilizes Volcengine's suite of AI and RTC services for a conversational experience.
  • Supports ESP32-S3-Korvo-2 and AtomS3R development boards.

Maintenance & Community

  • The project welcomes technical discussion through GitHub issues and community groups.

Licensing & Compatibility

  • The repository itself appears to be under a permissive license, but the underlying Espressif ADF framework has its own licensing. Specific Volcengine service usage is governed by Volcengine's terms.

Limitations & Caveats

  • The provided server example is for demonstration and quick testing only; production environments require a custom server implementation.
  • Requires specific Volcengine service configurations and API keys.
  • Strict adherence to Espressif ADF and IDF versions is necessary for device-side compilation.
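
Because the demo depends on several credentials being filled in, a small pre-flight check can catch empty values before the server starts. The sketch below is illustrative only: the field names in `REQUIRED_FIELDS` and the helper `missing_fields` are hypothetical and do not reflect the actual variable names in RtcAigcConfig.py; they simply mirror the kinds of settings listed in the quick start.

```python
from typing import Dict, List

# Hypothetical field names, modelled on the settings listed in the quick start.
REQUIRED_FIELDS = [
    "access_key",
    "secret_key",
    "rtc_app_id",
    "rtc_app_key",
    "ark_endpoint_id",
    "tts_voice_type",
    "asr_app_id",
    "asr_access_token",
    "tts_app_id",
    "tts_access_token",
]


def missing_fields(config: Dict[str, str]) -> List[str]:
    """Return the required fields that are absent or blank, in declaration order."""
    return [name for name in REQUIRED_FIELDS if not config.get(name, "").strip()]


# Example: every field filled with a placeholder except the Ark endpoint.
example = {name: "placeholder" for name in REQUIRED_FIELDS}
example["ark_endpoint_id"] = ""
print(missing_fields(example))
```

Failing fast on a blank credential tends to be easier to diagnose than an authentication error returned later by a cloud service.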

Health Check

  • Last commit: 1 week ago
  • Responsiveness: 1+ week
  • Pull requests (30d): 0
  • Issues (30d): 1
  • Star history: 4 stars in the last 30 days
