AI-Digital-Human  by Jack-Cherish

AI digital human pipeline using open-source tools

Created 2 years ago
307 stars

Top 87.3% on SourcePulse

Project Summary

This project provides a framework for creating AI-driven digital humans, targeting users interested in generating realistic virtual characters for various applications. It aims to simplify the process by integrating several open-source AI models for image enhancement, natural language processing, speech synthesis, and facial animation.

How It Works

The system orchestrates a pipeline of specialized AI models. Image super-resolution and face restoration are handled by CodeFormer. Large language model capabilities are provided by ChatGLM2-6B for generating text responses. Text-to-speech is achieved using vits, which can be fine-tuned with custom voice data. Finally, SadTalker drives facial animations on static images using the synthesized audio, creating a lip-synced digital human.
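The data flow described above can be sketched as a chain of four stages. This is only an illustrative sketch, not the project's actual code: the class `DigitalHumanPipeline`, its field names, and the `stub_*` functions are all hypothetical, and the stubs return labeled strings instead of calling CodeFormer, ChatGLM2-6B, vits, or SadTalker.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class DigitalHumanPipeline:
    """Hypothetical orchestrator; each stage is a plain callable so a real
    model wrapper could be swapped in for the corresponding stub."""
    restore_face: Callable[[str], str]       # portrait path -> restored portrait (CodeFormer's role)
    generate_reply: Callable[[str], str]     # user prompt -> reply text (ChatGLM2-6B's role)
    synthesize_speech: Callable[[str], str]  # reply text -> audio (vits's role)
    animate_face: Callable[[str, str], str]  # (portrait, audio) -> video (SadTalker's role)

    def run(self, portrait_path: str, prompt: str) -> str:
        restored = self.restore_face(portrait_path)
        reply = self.generate_reply(prompt)
        audio = self.synthesize_speech(reply)
        # The animation stage needs both the restored image and the audio.
        return self.animate_face(restored, audio)


# Placeholder stages (assumptions): they only record the data flow.
def stub_restore(path: str) -> str:
    return f"restored({path})"


def stub_reply(prompt: str) -> str:
    return f"reply({prompt})"


def stub_tts(text: str) -> str:
    return f"speech({text})"


def stub_animate(portrait: str, audio: str) -> str:
    return f"video({portrait}, {audio})"


pipeline = DigitalHumanPipeline(
    restore_face=stub_restore,
    generate_reply=stub_reply,
    synthesize_speech=stub_tts,
    animate_face=stub_animate,
)
result = pipeline.run("portrait.png", "Hello")
print(result)  # video(restored(portrait.png), speech(reply(Hello)))
```

Keeping each stage behind a simple callable makes it clear that the image branch (restoration) and the text branch (reply, then speech) are independent until the final animation step joins them.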

Quick Start & Requirements

Highlighted Details

  • Leverages CodeFormer for high-quality facial restoration.
  • Integrates ChatGLM2-6B for flexible conversational AI.
  • Supports custom voice training with vits for personalized speech synthesis.
  • Utilizes SadTalker for audio-driven facial animation.

Maintenance & Community

According to the README, the project is under active development, and the author plans a series of video tutorials and code releases. No community channels (e.g., Discord, Slack) are specified.

Licensing & Compatibility

The project itself does not specify a license. However, it integrates several components with their own licenses:

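  • CodeFormer: NTU S-Lab License 1.0 (restricts use to non-commercial purposes)
  • ChatGLM2-6B: Apache License 2.0 for the code; the model weights are covered by a separate Model License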
  • vits: Not explicitly stated in the README, but the linked repository is under the MIT License.
  • SadTalker: Not explicitly stated in the README, but the linked repository is under the Apache License 2.0.
  • Gradio: Apache License 2.0

Compatibility for commercial use depends on the licenses of the individual integrated components.

Limitations & Caveats

The project is still in development, with a significant portion of the promised tutorials and installation packages yet to be released. Some components used in earlier demonstrations were non-open-source, and the current open-source replacements (ChatGLM2-6B, SadTalker) may result in slightly different output quality.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: Inactive
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 0 stars in the last 30 days
