SadTalker by OpenTalker

Talking face animation from a single image and audio

Created 3 years ago

13,502 stars

Top 3.7% on SourcePulse

1 Expert Loves This Project

Omar Sanseviero

DevRel at Google DeepMind

Project Summary

SadTalker generates talking-head videos from a single portrait image and an audio file. It targets researchers and users interested in AI-driven animation, and produces stylized, realistic facial movements synchronized with speech. The project ships a Gradio WebUI and several integration options.

How It Works

SadTalker learns 3D motion coefficients from audio. It maps audio features to 3D facial motion parameters (expression and head pose), which then drive the animation of a single input portrait. The result is head movement and facial expression that follow the provided audio. An optional enhancer (GFPGAN) can be applied to improve visual quality.
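The data flow described above can be pictured with a toy sketch. This is not SadTalker's code: the class `MotionCoefficients`, the functions `audio_to_motion` and `animate`, and the coefficient sizes are all hypothetical stand-ins, and the "model" here is a trivial average rather than a learned network. It only shows the shape of the pipeline: audio features in, per-frame motion coefficients out, then one rendered frame per coefficient set.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class MotionCoefficients:
    """Per-frame motion parameters (hypothetical layout, for illustration only)."""
    expression: List[float]  # facial expression weights
    pose: List[float]        # head rotation/translation values


def audio_to_motion(audio_features: Sequence[Sequence[float]],
                    n_expression: int = 4,
                    n_pose: int = 3) -> List[MotionCoefficients]:
    """Stand-in for the learned audio-to-motion mapping.

    Each frame's mean feature value simply fills the expression vector;
    a real system would use trained networks here.
    """
    motion = []
    for frame in audio_features:
        level = sum(frame) / len(frame) if frame else 0.0
        motion.append(MotionCoefficients([level] * n_expression, [0.0] * n_pose))
    return motion


def animate(portrait: str,
            motion: List[MotionCoefficients],
            use_enhancer: bool = False) -> List[str]:
    """Stand-in for the renderer: returns one frame label per coefficient set."""
    suffix = "+enhanced" if use_enhancer else ""
    return [f"{portrait}@frame{i}{suffix}" for i, _ in enumerate(motion)]


if __name__ == "__main__":
    features = [[0.0, 1.0], [1.0, 1.0]]  # two audio frames of made-up features
    frames = animate("portrait.png", audio_to_motion(features), use_enhancer=True)
    print(frames)
```

The point of the split is that the portrait is only touched in the last stage; everything before it depends on the audio alone.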

Quick Start & Requirements

Installation: Clone the repository and install the dependencies listed in requirements.txt.
Prerequisites: Python 3.8, PyTorch 1.12.1+cu113 (the recommended build, for CUDA 11.3), torchvision 0.13.1+cu113, torchaudio 0.12.1, and ffmpeg. A CUDA-capable GPU is highly recommended for performance.
Models: Requires downloading pre-trained checkpoints for various components (mapping, expression, pose, facevid2vid, etc.) and GFPGAN weights.
Resources: The model downloads can be automated with a script. Local execution needs a Python environment; a GPU is optional but recommended.
Demos: Online demos are available via Hugging Face Spaces and a Colab notebook. A local Gradio WebUI can be run using app_sadtalker.py or webui.sh/webui.bat.
Docs: Best practices and configuration tips are documented in the repository.
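Before installing, it can help to confirm that the prerequisites listed above are present. The following is a minimal sketch using only the Python standard library; the function name `check_environment` and its report format are assumptions made for illustration, not part of SadTalker. It checks the interpreter version, whether the PyTorch packages are importable, and whether ffmpeg is on the PATH. It does not verify the exact package versions or CUDA availability.

```python
import importlib.util
import shutil
import sys
from typing import Dict, Sequence, Tuple


def check_environment(required_python: Tuple[int, int] = (3, 8),
                      modules: Sequence[str] = ("torch", "torchvision", "torchaudio"),
                      tools: Sequence[str] = ("ffmpeg",)) -> Dict[str, bool]:
    """Return a name -> bool report of which prerequisites were found."""
    report = {"python": sys.version_info[:2] >= required_python}
    for name in modules:
        # find_spec returns None when a top-level module is not installed
        report[name] = importlib.util.find_spec(name) is not None
    for tool in tools:
        # shutil.which returns None when the executable is not on PATH
        report[tool] = shutil.which(tool) is not None
    return report


if __name__ == "__main__":
    for name, ok in check_environment().items():
        print(f"{name}: {'found' if ok else 'MISSING'}")
```

A missing entry only means the item was not found in the current environment; it may still be installed in another virtual environment.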

Highlighted Details

The accompanying paper was accepted at CVPR 2023.
Offers multiple animation modes: still, reference, and resize.
Integrated into a Stable Diffusion WebUI extension.
Available for free use on Discord.
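As a rough illustration of how the still and resize modes might be selected on the command line, here is a small sketch that assembles an invocation as an argument list. The script name `inference.py` and the flags `--driven_audio`, `--source_image`, `--result_dir`, `--enhancer`, `--still`, and `--preprocess` are recalled from the upstream README and are not stated in this summary, so treat them as assumptions and confirm them with `python inference.py --help`. The helper `build_inference_command` is hypothetical, and it only builds the command; it does not run it.

```python
import shlex
from typing import List, Optional


def build_inference_command(driven_audio: str,
                            source_image: str,
                            result_dir: Optional[str] = None,
                            enhancer: Optional[str] = None,
                            still: bool = False,
                            preprocess: Optional[str] = None) -> List[str]:
    """Assemble a command line; flag names are assumed, not verified here."""
    cmd = ["python", "inference.py",
           "--driven_audio", driven_audio,
           "--source_image", source_image]
    if result_dir:
        cmd += ["--result_dir", result_dir]
    if enhancer:
        cmd += ["--enhancer", enhancer]      # e.g. "gfpgan"
    if still:
        cmd.append("--still")                # assumed flag for still mode
    if preprocess:
        cmd += ["--preprocess", preprocess]  # e.g. "resize"
    return cmd


if __name__ == "__main__":
    cmd = build_inference_command("speech.wav", "portrait.png",
                                  enhancer="gfpgan", still=True,
                                  preprocess="resize")
    print(shlex.join(cmd))  # shell-quoted string for copy and paste
```

Returning a list rather than a string keeps file names with spaces intact if the command is later passed to `subprocess.run`.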

Maintenance & Community

Development was active through mid-2023; the most recent updates noted date from June 2023.
Community support via Discord server.
Roadmap and updates tracked in GitHub issues.

Licensing & Compatibility

Licensed under Apache 2.0.
The earlier non-commercial restriction has been removed, so commercial use is permitted under the Apache 2.0 terms.
The project disclaimer notes that Tencent names/logos are not authorized for use without permission.

Limitations & Caveats

The project's disclaimer states it is not an official Tencent product and users must comply with applicable laws and intellectual property rights. It prohibits use for harmful activities or violations of social ethics.

Health Check

Last commit: 1 year ago
Responsiveness: Inactive

Star History

90 stars in the last 30 days