Magic-Me by Zhen-Dong

Video diffusion for personalized clips

created 1 year ago
454 stars

Top 67.5% on sourcepulse

Project Summary

Magic-Me is a framework for generating personalized videos featuring specific individuals, pets, or objects. It targets users who want custom video content with familiar faces, offering identity-conditioned generation that generic text-to-video models do not provide. The primary benefit is the ability to generate high-quality, identity-consistent video clips from a handful of user-provided reference images.

How It Works

Magic-Me employs a customized diffusion model with three key components for robust identity preservation. An ID module, trained with prompt-to-segmentation on cropped identities, disentangles identity information from the background so that the identity tokens are learned accurately. A text-to-video (T2V) module uses a 3D Gaussian Noise Prior to improve inter-frame consistency, as illustrated in the sketch below. Finally, video-to-video (V2V) modules for face and tiled video generation handle face deblurring and upscaling for higher-resolution output.
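
The inter-frame consistency idea can be pictured with a small example: instead of sampling independent Gaussian noise for every frame, the initial latent noise mixes a component shared by all frames with per-frame noise. This is a minimal sketch under the assumption that the 3D Gaussian Noise Prior correlates frame noise in this way; the function name and the mixing weight alpha are hypothetical and not part of the repository's API.

```python
import torch

def correlated_video_noise(num_frames, channels, height, width, alpha=0.5):
    """Sample per-frame latent noise that shares a common component across frames.

    alpha controls the correlation between frames: alpha=0 gives fully
    independent noise per frame, alpha=1 gives identical noise for all frames.
    """
    shared = torch.randn(1, channels, height, width)               # common to every frame
    per_frame = torch.randn(num_frames, channels, height, width)   # frame-specific part
    # Convex mixing keeps unit variance: alpha^2 + (1 - alpha^2) = 1.
    return alpha * shared + (1.0 - alpha ** 2) ** 0.5 * per_frame

# Example: initial noise for a 16-frame clip at 64x64 latent resolution.
noise = correlated_video_noise(num_frames=16, channels=4, height=64, width=64)
print(noise.shape)  # torch.Size([16, 4, 64, 64])
```

A larger alpha trades per-frame diversity for temporal smoothness, which is the basic reason a shared noise component helps keep frames consistent.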

Quick Start & Requirements

  • Install via conda env create -f environment.yaml after cloning the repository.
  • Requires Anaconda, git lfs, and checkpoints for Stable Diffusion 1.5.
  • Training requires 5-10 images of the target identity and a custom .yaml configuration (see the sketch after this list).
  • Inference can be performed via a Hugging Face Spaces demo or a provided Colab notebook for ComfyUI.
  • Available pre-trained embeddings include 24 different characters.
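
As a rough illustration of what an identity-training configuration might capture, the snippet below writes a minimal .yaml file programmatically. The key names (identity_name, reference_images, pretrained_model, train_steps) are assumptions made for illustration and do not reflect the repository's actual schema; consult the project's documentation for the real fields.

```python
import os
import yaml  # requires PyYAML

def write_identity_config(name, image_dir, out_path, steps=3000):
    """Write a minimal, hypothetical training config for one identity."""
    config = {
        "identity_name": name,          # token later referenced in prompts
        "reference_images": image_dir,  # folder holding the 5-10 cropped photos
        "pretrained_model": "runwayml/stable-diffusion-v1-5",  # SD 1.5 checkpoint
        "train_steps": steps,
    }
    os.makedirs(os.path.dirname(out_path), exist_ok=True)
    with open(out_path, "w") as f:
        yaml.safe_dump(config, f)

write_identity_config("my_dog", "data/my_dog", "configs/my_dog.yaml")
```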

Highlighted Details

  • SD-XL integration is planned.
  • Integrates multi-prompt and Prompt Travel features (see the sketch after this list).
  • Offers both T2V and V2V (Face, Tiled) customization.
  • Codebase built upon Tune-a-Video and AnimateDiff.
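
Prompt Travel, mentioned above, roughly means moving from one prompt to another over the course of a clip. The sketch below shows one common way to realize the idea by interpolating text-encoder embeddings per frame; it is an illustrative assumption rather than the project's implementation, and the function name is hypothetical.

```python
import torch

def travel_prompt_embeddings(start_emb, end_emb, num_frames):
    """Linearly blend two prompt embeddings across the frames of a clip.

    start_emb / end_emb: tensors of shape (tokens, dim) from a text encoder.
    Returns a (num_frames, tokens, dim) tensor, one conditioning per frame.
    """
    weights = torch.linspace(0.0, 1.0, num_frames).view(-1, 1, 1)
    return (1.0 - weights) * start_emb + weights * end_emb

# Example with dummy embeddings (77 tokens, 768 dims, as in the SD 1.5 text encoder).
start = torch.randn(77, 768)
end = torch.randn(77, 768)
per_frame = travel_prompt_embeddings(start, end, num_frames=16)
print(per_frame.shape)  # torch.Size([16, 77, 768])
```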

Maintenance & Community

The project is associated with authors Ze Ma, Daquan Zhou, and Zhen Dong. Further details on community channels or roadmaps are not explicitly provided in the README.

Licensing & Compatibility

The project is released for academic use. The README does not specify a license, but its disclaimer suggests restrictions on commercial use and places liability for generated content on the user.

Limitations & Caveats

The project is released for academic use; the contributors disclaim responsibility for user-generated content and emphasize users' liability and ethical use. Planned but unreleased features include pose/depth/stretch control and additional demos (Magic-Me Instant, Magic-Me Crowd).

Health Check

  • Last commit: 1 year ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0

Star History

4 stars in the last 90 days
