Video diffusion for personalized clips
Magic-Me is a framework for generating personalized videos featuring specific individuals, pets, or objects. It targets users who want to create custom video content with familiar faces, offering an identity-aware alternative to generic text-to-video models. The primary benefit is the ability to generate high-quality, identity-consistent video clips from user-provided reference images.
How It Works
Magic-Me employs a customized diffusion model with three key components for robust identity preservation. An ID module, trained with prompt-to-segmentation on cropped identities, disentangles identity information from background noise for accurate token learning. A text-to-video (T2V) module utilizes a 3D Gaussian Noise Prior for enhanced inter-frame consistency. Additionally, video-to-video (V2V) modules for face and tiled video generation address face deblurring and video upscaling for higher resolution outputs.
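The 3D Gaussian Noise Prior amounts to drawing the initial latent noise for all frames jointly, so that frames start from correlated rather than independent noise. The sketch below is a simplified PyTorch illustration of that idea; the latent shapes and the mixing weight alpha are assumptions for illustration, not the repository's exact implementation.

```python
import math
import torch

def correlated_noise_prior(num_frames, channels, height, width, alpha=0.2, generator=None):
    """Sample initial latents whose noise is shared across frames.

    Blending a single shared noise map with per-frame noise makes the
    starting latents of neighboring frames similar, which encourages
    inter-frame consistency during denoising. `alpha` (an assumed
    hyperparameter) controls how strongly frames are correlated.
    """
    shared = torch.randn(1, channels, height, width, generator=generator)
    per_frame = torch.randn(num_frames, channels, height, width, generator=generator)
    noise = alpha * shared + (1.0 - alpha) * per_frame
    # Rescale so each frame's noise keeps unit variance, as the diffusion
    # sampler expects standard Gaussian inputs.
    return noise / math.sqrt(alpha ** 2 + (1.0 - alpha) ** 2)

# Example: initial latents for a 16-frame clip at 64x64 latent resolution.
latents = correlated_noise_prior(num_frames=16, channels=4, height=64, width=64)
print(latents.shape)  # torch.Size([16, 4, 64, 64])
```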
Quick Start & Requirements
Create the environment with conda env create -f environment.yaml after cloning the repository. The setup additionally requires git lfs and checkpoints for Stable Diffusion 1.5, and generation runs are specified through a .yaml configuration.
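The .yaml configuration typically bundles the prompt, checkpoint paths, and sampling settings for a run. The snippet below only illustrates reading such a file with PyYAML; the path and key names are hypothetical placeholders, not the repository's actual schema.

```python
import yaml

# Hypothetical config path and keys, used only to illustrate the pattern.
with open("configs/example.yaml") as f:
    cfg = yaml.safe_load(f)

prompt = cfg.get("prompt", "a photo of a person walking on a beach")
sd_checkpoint = cfg.get("pretrained_model_path", "checkpoints/stable-diffusion-v1-5")
num_frames = cfg.get("video_length", 16)
print(prompt, sd_checkpoint, num_frames)
```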
Highlighted Details
Maintenance & Community
The project is associated with authors Ze Ma, Daquan Zhou, and Zhen Dong. The repository is marked inactive, with its last update roughly a year ago, and further details on community channels or roadmaps are not explicitly provided in the README.
Licensing & Compatibility
The project is released for academic use. The README does not specify a license; its disclaimer suggests restrictions on commercial use and places liability for generated content on the user.
Limitations & Caveats
The project is released for academic use, and contributors disclaim responsibility for user-generated content, emphasizing user liability and ethical use. Future features like pose/depth/stretch control and additional demo releases (Magic-Me Instant, Magic-Me Crowd) are listed as planned.