DH_live by kleinlee

Real-time digital human for mobile and web

created 1 year ago
1,593 stars

Top 26.8% on sourcepulse

Project Summary

This project provides a real-time live streaming digital human solution, primarily focusing on DH_live_mini, designed for broad accessibility across mobile and web platforms without requiring GPUs. It targets users seeking an efficient, easy-to-deploy digital human avatar for applications like live streaming and real-time conversations.

How It Works

DH_live_mini is heavily optimized: its single-frame compute cost is 39 MFLOPs, significantly less than most mobile face detection algorithms. This efficiency lets it run directly in mobile browsers and on CPUs, with a web resource package compressible to under 3MB. The system supports real-time dialogue by chaining Voice Activity Detection (VAD), Automatic Speech Recognition (ASR), a Large Language Model (LLM), Text-to-Speech (TTS), and the digital human rendering pipeline.
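
To put the per-frame figure in context, the short sketch below converts it into a sustained compute rate. It assumes the 39 MFLOPs figure means floating-point operations per rendered frame; the frame rates (25, 30, 60 fps) and the function name are illustrative choices, not values taken from the project.

```python
# Rough compute-budget estimate for a per-frame cost of 39 MFLOPs.
# Assumption: "39 MFLOPs" counts floating-point operations per rendered frame.
# The frame rates below are illustrative, not taken from the project.

MFLOPS_PER_FRAME = 39.0


def sustained_gflops(mflops_per_frame: float, fps: float) -> float:
    """Return the sustained compute rate in GFLOP/s at the given frame rate."""
    return mflops_per_frame * 1e6 * fps / 1e9


for fps in (25, 30, 60):
    rate = sustained_gflops(MFLOPS_PER_FRAME, fps)
    print(f"{fps} fps -> {rate:.3f} GFLOP/s")
```

Under these assumptions, rendering at 25 fps needs roughly 1 GFLOP/s of sustained compute, which is consistent with the claim that no GPU is required.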

Quick Start & Requirements

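  • Install: run the following commands in order. For a CPU-only setup, install the CPU build of PyTorch in place of the cu124 wheel.
      conda create -n dh_live python=3.11
      conda activate dh_live
      pip install torch --index-url https://download.pytorch.org/whl/cu124
      pip install -r requirements.txt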
  • Prerequisites: Python 3.11, PyTorch (GPU or CPU), checkpoint files (BaiduDrive/GoogleDrive).
  • Setup: Requires downloading and unzipping checkpoint files. Video data preparation is handled by data_preparation_mini.py and data_preparation_web.py.
  • Demo: Run python web_demo/server.py, then open localhost:8888/static/MiniLive.html in a browser.
  • Docs: bilibili video
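
Before starting the demo, it can help to confirm that the environment matches the prerequisites listed above. The following is a minimal sketch, not part of the project: check_environment and demo_url are hypothetical helper names, and the script only inspects the Python version and whether PyTorch can be imported.

```python
# Minimal environment check for the prerequisites listed above.
# check_environment() and demo_url() are hypothetical helpers for illustration;
# they are not part of the DH_live repository.
import importlib.util
import sys

REQUIRED_PYTHON = (3, 11)


def check_environment() -> dict:
    """Report the Python version and whether PyTorch (and CUDA) is usable."""
    report = {
        "python_version": tuple(sys.version_info[:2]),
        "python_ok": tuple(sys.version_info[:2]) == REQUIRED_PYTHON,
        "torch_installed": importlib.util.find_spec("torch") is not None,
        "cuda_available": False,
    }
    if report["torch_installed"]:
        try:
            import torch

            report["cuda_available"] = bool(torch.cuda.is_available())
        except Exception:
            # A broken install counts as "not usable" for this check.
            report["torch_installed"] = False
    return report


def demo_url(host: str = "localhost", port: int = 8888) -> str:
    """Build the demo page URL given in the Quick Start section."""
    return f"http://{host}:{port}/static/MiniLive.html"


report = check_environment()
for key, value in report.items():
    print(f"{key}: {value}")
print("demo page:", demo_url())
```

A False value for cuda_available is not an error here, since DH_live_mini is described as running on CPU.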

Highlighted Details

  • Minimalist web resource package (under 2MB gzip).
  • No training required; ready to use out-of-the-box.
  • Supports real-time dialogue pipeline (VAD-ASR-LLM-TTS).
  • CPU support for DH_live_mini.
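
The VAD-ASR-LLM-TTS chain mentioned above can be pictured as a sequence of stages feeding the renderer. The sketch below is a conceptual illustration only: the DialoguePipeline class and every stub function are hypothetical stand-ins, not the project's actual API or models.

```python
# Conceptual sketch of a VAD -> ASR -> LLM -> TTS -> render chain.
# Every name here is a hypothetical stand-in; none of it is DH_live's real API.
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class DialoguePipeline:
    vad: Callable[[bytes], bool]          # does the audio chunk contain speech?
    asr: Callable[[bytes], str]           # speech audio -> transcript
    llm: Callable[[str], str]             # transcript -> reply text
    tts: Callable[[str], bytes]           # reply text -> synthesized audio
    render: Callable[[bytes], List[str]]  # synthesized audio -> avatar frames

    def handle_chunk(self, chunk: bytes) -> Optional[List[str]]:
        """Run one audio chunk through the chain; return None for silence."""
        if not self.vad(chunk):
            return None
        transcript = self.asr(chunk)
        reply = self.llm(transcript)
        audio = self.tts(reply)
        return self.render(audio)


# Stub stages so the sketch runs without any models.
def stub_vad(chunk: bytes) -> bool:
    return any(chunk)  # treat any non-zero byte as speech


def stub_asr(chunk: bytes) -> str:
    return "hello"


def stub_llm(text: str) -> str:
    return f"You said: {text}"


def stub_tts(text: str) -> bytes:
    return text.encode("utf-8")


def stub_render(audio: bytes) -> List[str]:
    return [f"frame-{i}" for i in range(3)]


pipeline = DialoguePipeline(stub_vad, stub_asr, stub_llm, stub_tts, stub_render)
print(pipeline.handle_chunk(b"\x00\x00"))  # silent chunk, skipped by VAD
print(pipeline.handle_chunk(b"\x01\x02"))  # speech chunk, rendered to frames
```

Gating on VAD first means the heavier ASR, LLM, and TTS stages run only when speech is detected, which is one plausible reason for this ordering in a real-time system.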

Maintenance & Community

  • Maintenance is focused primarily on DH_live_mini.
  • WeChat and QQ groups available for community interaction.

Licensing & Compatibility

  • Licensed under the MIT License.
  • Permissive for commercial use and closed-source linking.

Limitations & Caveats

The original DH_live is no longer supported. Neither offline video synthesis nor audio file processing for demo_mini.py is supported on Linux/macOS.

Health Check

  • Last commit: 1 month ago
  • Responsiveness: 1 week
  • Pull Requests (30d): 0
  • Issues (30d): 3
  • Star History: 237 stars in the last 90 days
