This project provides a modular, interactive digital human conversation system designed to run on a single PC, targeting developers and researchers in AI and virtual reality. It offers low-latency, multimodal conversations with customizable components, enabling flexible integration of various AI models for speech, language, and avatar rendering.
How It Works
The system employs a modular architecture, allowing users to swap components for Automatic Speech Recognition (ASR), Large Language Models (LLM), Text-to-Speech (TTS), and avatar rendering. It supports both a fully local mode using models like MiniCPM-o and a hybrid mode leveraging cloud APIs for LLM and TTS. This flexibility reduces system requirements and allows for diverse conversational experiences.
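The swappable-component idea can be sketched as a small pipeline of abstract handlers. This is a minimal illustration only: the class and method names (`ASRHandler`, `ConversationPipeline`, etc.) are hypothetical and do not reflect the project's actual API.

```python
from abc import ABC, abstractmethod

class ASRHandler(ABC):
    """Speech-to-text component (swappable)."""
    @abstractmethod
    def transcribe(self, audio: bytes) -> str: ...

class LLMHandler(ABC):
    """Language-model component (local model or cloud API)."""
    @abstractmethod
    def chat(self, prompt: str) -> str: ...

class TTSHandler(ABC):
    """Text-to-speech component (swappable)."""
    @abstractmethod
    def synthesize(self, text: str) -> bytes: ...

class ConversationPipeline:
    """Chains ASR -> LLM -> TTS; any stage can be replaced independently."""
    def __init__(self, asr: ASRHandler, llm: LLMHandler, tts: TTSHandler):
        self.asr, self.llm, self.tts = asr, llm, tts

    def respond(self, audio: bytes) -> bytes:
        text = self.asr.transcribe(audio)   # speech -> text
        reply = self.llm.chat(text)         # text -> reply
        return self.tts.synthesize(reply)   # reply -> speech

# Trivial stub implementations, just to show the wiring:
class EchoASR(ASRHandler):
    def transcribe(self, audio: bytes) -> str:
        return audio.decode()

class EchoLLM(LLMHandler):
    def chat(self, prompt: str) -> str:
        return f"You said: {prompt}"

class EchoTTS(TTSHandler):
    def synthesize(self, text: str) -> bytes:
        return text.encode()

pipeline = ConversationPipeline(EchoASR(), EchoLLM(), EchoTTS())
print(pipeline.respond(b"hello"))  # b'You said: hello'
```

Swapping the local `EchoLLM` stub for a cloud-backed handler (hybrid mode) requires no change to the pipeline itself, which is the flexibility the architecture is built around.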
Quick Start & Requirements
- Installation: `uv` is recommended for environment management. Install dependencies via `uv sync --all-packages` or mode-specific installs. Run via `uv run src/demo.py --config <config_file.yaml>`. Docker execution is also supported via `./build_and_run.sh --config <config_file.yaml>`.
- Prerequisites: Python >=3.10, <3.12. CUDA-enabled GPU with NVIDIA driver supporting CUDA >= 12.4. Unquantized MiniCPM-o requires >20GB VRAM; int4 quantized version reduces VRAM needs. Git LFS is required for submodules.
- Resources: Local MiniCPM-o inference achieves an average response latency of ~2.2s on an i9-13900KF with an RTX 4090. CPU inference can reach up to 30 FPS.
- Links: Demo, LiteAvatarGallery, LAM
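Runtime behavior is selected through the YAML config passed on the command line. The fragment below is illustrative only: the field and module names are hypothetical and do not reflect the project's actual configuration schema.

```yaml
# Hypothetical config sketch: keys shown here are assumptions, not the real schema.
chat_engine:
  handlers:
    asr:
      module: local_asr        # local speech recognition
    llm:
      module: minicpm_o        # fully local multimodal model
      # module: cloud_llm_api  # hybrid mode: cloud LLM instead
    tts:
      module: cosyvoice_tts    # swap for a cloud TTS API in hybrid mode
  avatar:
    module: lite_avatar        # 2D avatar; a LAM module would select 3D rendering
```

Shipping several such pre-set configs is how the project offers different model combinations without code changes.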
Highlighted Details
- Low-latency (avg. 2.2s) real-time digital human conversation.
- Supports multimodal LLMs (text, audio, video).
- Modular design for flexible component replacement.
- Integrates LiteAvatar for 2D avatars and LAM for ultra-realistic 3D digital humans.
- Offers multiple pre-set configurations for different model combinations.
Maintenance & Community
- Active development with recent releases (v0.3.0 on 2025.04.18).
- Community contributions acknowledged, with links to deployment tutorials.
- Project is actively maintained by HumanAIGC-Engineering.
Licensing & Compatibility
- The repository itself appears to be under a permissive license, but specific component licenses (e.g., for models like MiniCPM-o, CosyVoice) should be reviewed individually for commercial use restrictions.
Limitations & Caveats
- CosyVoice local TTS on Windows requires a specific Conda-based installation workaround due to `pynini` compilation issues.
- Using video input with MiniCPM-o can significantly increase VRAM consumption, potentially leading to OOM errors on lower-spec GPUs.
- LAM avatar generation pipeline is noted as "not ready yet."