SadTalker by OpenTalker

Talking face animation from a single image and audio

created 2 years ago
13,045 stars

Top 3.9% on sourcepulse

Project Summary

SadTalker generates talking head videos from a single portrait image and an audio file. It is designed for researchers and users interested in AI-driven animation, offering a method to create stylized, realistic facial movements synchronized with speech. The project provides a user-friendly interface and multiple integration options.

How It Works

SadTalker learns to predict 3D facial motion coefficients from audio. It maps audio features to these motion parameters, which then drive the animation of a single input portrait. The result is head movement and facial expression that follow the provided audio; an optional enhancer (GFPGAN) can be applied to improve visual quality.
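The data flow described above can be pictured with a toy sketch. Everything in it is hypothetical: the function names, the coefficient sizes, and the trivial arithmetic standing in for learned networks are assumptions made for illustration, not SadTalker's code. It only shows the shape of the pipeline, where each audio frame yields one set of motion coefficients and each set drives one output frame.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class MotionCoefficients:
    """Hypothetical per-frame motion parameters (sizes are illustrative)."""
    expression: List[float]
    pose: List[float]


def audio_to_motion(audio_features: List[List[float]],
                    n_expression: int = 4,
                    n_pose: int = 3) -> List[MotionCoefficients]:
    """Stand-in for the learned audio-to-coefficient mapping.

    A real system would use trained networks; here the mean absolute
    value of each audio frame is used as a placeholder signal.
    """
    motion = []
    for frame in audio_features:
        energy = sum(abs(x) for x in frame) / max(len(frame), 1)
        motion.append(MotionCoefficients(
            expression=[energy * (i + 1) for i in range(n_expression)],
            pose=[energy * 0.1 for _ in range(n_pose)],
        ))
    return motion


def animate(portrait: str, motion: List[MotionCoefficients]) -> List[str]:
    """Stand-in for the face renderer: describes frames instead of drawing them."""
    return [
        f"{portrait} frame {i}: {len(m.expression)} expression, {len(m.pose)} pose values"
        for i, m in enumerate(motion)
    ]


# Three made-up audio frames produce three output frames.
features = [[0.0, 0.5], [1.0, -1.0], [0.2, 0.2]]
motion = audio_to_motion(features)
frames = animate("portrait.png", motion)
```

The point of the sketch is the one-to-one relationship between audio frames and coefficient sets, which is what lets a single still image be animated for the full length of the audio.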

Quick Start & Requirements

  • Installation: Clone the repository and install dependencies via requirements.txt. PyTorch 1.12.1 with CUDA 11.3 is recommended.
  • Prerequisites: Python 3.8, PyTorch 1.12.1+cu113, torchvision 0.13.1+cu113, torchaudio 0.12.1, ffmpeg. GPU with CUDA support is highly recommended for performance.
  • Models: Requires downloading pre-trained checkpoints for various components (mapping, expression, pose, facevid2vid, etc.) and GFPGAN weights.
  • Resources: Setup involves downloading models, which can be automated via a script. Local execution requires a Python environment and potentially a GPU.
  • Demos: Online demos are available via Hugging Face Spaces and a Colab notebook. A local Gradio WebUI can be run using app_sadtalker.py or webui.sh/webui.bat.
  • Docs: Best practices and configuration tips are covered in the project documentation.
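Before installing, it can help to check the local environment against the prerequisites listed above. The following is a minimal sketch using only the Python standard library; the function name check_environment and the report keys are made up for this example. It reports what it finds and does not install or download anything.

```python
import importlib.util
import shutil
import sys


def check_environment() -> dict:
    """Report whether the listed prerequisites appear to be present."""
    report = {
        "python_version": f"{sys.version_info.major}.{sys.version_info.minor}",
        # The summary lists Python 3.8 as the prerequisite version.
        "python_is_3_8": sys.version_info[:2] == (3, 8),
        # ffmpeg must be discoverable on PATH.
        "ffmpeg_on_path": shutil.which("ffmpeg") is not None,
    }
    # find_spec returns None when a top-level package is not installed.
    for module in ("torch", "torchvision", "torchaudio"):
        report[f"{module}_installed"] = importlib.util.find_spec(module) is not None
    return report


if __name__ == "__main__":
    for key, value in check_environment().items():
        print(f"{key}: {value}")
```

The check only confirms that the packages are importable. It does not verify the specific PyTorch or CUDA builds, which would require importing torch and inspecting its version.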

Highlighted Details

  • Accompanying paper accepted at CVPR 2023.
  • Offers multiple animation modes: still, reference, and resize.
  • Integrated into a Stable Diffusion WebUI extension.
  • Available for free use on Discord.
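The animation modes and the optional enhancer are selected through command-line options. The sketch below assembles such a command without running it. The flag names (--driven_audio, --source_image, --preprocess, --still, --enhancer, --ref_pose) reflect the repository's inference.py as best understood at the time of writing and should be confirmed with python inference.py --help. The helper name build_inference_command and the file paths are hypothetical.

```python
import shlex
from typing import List, Optional


def build_inference_command(source_image: str,
                            driven_audio: str,
                            still: bool = False,
                            preprocess: str = "crop",
                            enhancer: Optional[str] = None,
                            ref_pose: Optional[str] = None) -> List[str]:
    """Assemble an argument list for inference.py (flag names are assumptions)."""
    cmd = [
        "python", "inference.py",
        "--driven_audio", driven_audio,
        "--source_image", source_image,
        "--preprocess", preprocess,   # e.g. "crop", "resize", or "full"
    ]
    if still:
        cmd.append("--still")         # still mode: reduced head motion
    if enhancer:
        cmd += ["--enhancer", enhancer]   # e.g. "gfpgan"
    if ref_pose:
        cmd += ["--ref_pose", ref_pose]   # reference video supplying head pose
    return cmd


# Hypothetical input files; print the command rather than executing it.
command = build_inference_command(
    "portrait.png", "speech.wav",
    still=True, preprocess="full", enhancer="gfpgan",
)
print(shlex.join(command))
```

Building the command as a list keeps paths containing spaces intact if it is later passed to subprocess.run, and shlex.join gives a copy-pasteable shell form for inspection.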

Maintenance & Community

  • Latest noted updates date from June 2023.
  • Community support via Discord server.
  • Roadmap and updates tracked in GitHub issues.

Licensing & Compatibility

  • Licensed under Apache 2.0.
  • The license explicitly removes non-commercial restrictions.
  • The project disclaimer notes that Tencent names/logos are not authorized for use without permission.

Limitations & Caveats

The project's disclaimer states it is not an official Tencent product and users must comply with applicable laws and intellectual property rights. It prohibits use for harmful activities or violations of social ethics.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: Inactive
  • Pull requests (30d): 2
  • Issues (30d): 3
  • Star history: 445 stars in the last 90 days
