IoT demo for RTC AIGC integration on ESP32
This project provides a demo for integrating Real-Time Communication (RTC) with AI-generated content (AIGC) on embedded devices, targeting developers working with IoT and AI applications. It showcases a real-time conversational AI experience powered by Volcengine's cloud services and Espressif hardware.
How It Works
The demo orchestrates a pipeline involving Volcengine's RTC, Speech Recognition (ASR), Text-to-Speech (TTS), and Ark large language models. An embedded device (ESP32-S3-Korvo-2) captures audio and sends it for ASR; the transcribed text is passed to a large language model, which produces a response; TTS synthesizes that response into speech; and the device plays it back. The server component manages API interactions and configurations.
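The flow described above (audio in, ASR, LLM, TTS, audio out) can be sketched as a chain of three stages. This is only an illustrative sketch: the `VoicePipeline` class and the `fake_asr`, `fake_llm`, and `fake_tts` stand-ins are hypothetical names, not part of the demo or of any Volcengine SDK, and the real services are reached over the network rather than through local functions.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class VoicePipeline:
    """Chains the three stages of one conversational turn (hypothetical sketch)."""
    recognize: Callable[[bytes], str]    # ASR stage: audio -> transcript
    generate: Callable[[str], str]       # LLM stage: transcript -> reply text
    synthesize: Callable[[str], bytes]   # TTS stage: reply text -> audio

    def handle_turn(self, audio: bytes) -> bytes:
        transcript = self.recognize(audio)
        reply = self.generate(transcript)
        return self.synthesize(reply)


# Local stand-ins so the sketch runs without any cloud service.
def fake_asr(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_llm(prompt: str) -> str:
    return f"You said: {prompt}"


def fake_tts(text: str) -> bytes:
    return text.encode("utf-8")


pipeline = VoicePipeline(fake_asr, fake_llm, fake_tts)
playback = pipeline.handle_turn(b"hello")
```

Keeping each stage behind a plain callable makes it easy to swap a stub for a real client when testing the flow end to end.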
Quick Start & Requirements
On the server, install the dependency with pip install requests.
Fill in RtcAigcConfig.py with Volcengine API keys (AK/SK), RTC AppID/AppKey, Ark EndpointId, TTS Voice Type, and ASR/TTS AppIDs/Access Tokens.
Start the server with python3 RtcAigcService.py.
On the device side, clone esp-adf, reset it to the specific commit 0d76650198ca96546c40d10a7ce8963bacdf820b, update submodules, then run ./install.sh esp32s3 and . ./export.sh.
Put the demo project under $ADF_PATH/examples.
Edit Config.h with the server address and Volcengine parameters.
Run idf.py set-target esp32s3, idf.py menuconfig (set WiFi and board), and idf.py build.
Run idf.py flash, then idf.py monitor.
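Because the server depends on several credentials being filled in before it starts, a small pre-flight check can catch an empty value early. The sketch below is an assumption-laden illustration: the key names in `REQUIRED_FIELDS` are hypothetical placeholders that mirror the credential list above, and they are not the actual variable names used in RtcAigcConfig.py.

```python
# Hypothetical field names mirroring the credentials listed above;
# the real RtcAigcConfig.py may name and structure them differently.
REQUIRED_FIELDS = [
    "AK", "SK",
    "RTC_APP_ID", "RTC_APP_KEY",
    "ARK_ENDPOINT_ID",
    "TTS_VOICE_TYPE",
    "ASR_APP_ID", "ASR_ACCESS_TOKEN",
    "TTS_APP_ID", "TTS_ACCESS_TOKEN",
]


def missing_fields(config: dict, required: list = REQUIRED_FIELDS) -> list:
    """Return the required keys that are absent, None, or blank."""
    missing = []
    for name in required:
        value = config.get(name)
        if value is None or not str(value).strip():
            missing.append(name)
    return missing


# Example with placeholder values only; SK is blank and
# TTS_ACCESS_TOKEN is left out entirely.
example_config = {
    "AK": "placeholder-ak",
    "SK": "   ",
    "RTC_APP_ID": "placeholder-app-id",
    "RTC_APP_KEY": "placeholder-app-key",
    "ARK_ENDPOINT_ID": "placeholder-endpoint",
    "TTS_VOICE_TYPE": "placeholder-voice",
    "ASR_APP_ID": "placeholder-asr-app",
    "ASR_ACCESS_TOKEN": "placeholder-asr-token",
    "TTS_APP_ID": "placeholder-tts-app",
}

unset = missing_fields(example_config)
```

Running a check like this before launching the service turns a confusing API failure into a clear list of values still to be filled in.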
Highlighted Details
Maintenance & Community
Licensing & Compatibility
Limitations & Caveats