Video diffusion for personalized clips
Magic-Me is a framework for generating personalized videos featuring specific individuals, pets, or objects. It targets users who want to create custom video content with familiar faces, offering an identity-aware alternative to generic text-to-video models. The primary benefit is the ability to generate high-quality, identity-consistent video clips from user-provided reference images.
How It Works
Magic-Me employs a customized diffusion model with three key components for robust identity preservation. An ID module, trained with prompt-to-segmentation on cropped identities, disentangles identity information from background noise for accurate token learning. A text-to-video (T2V) module utilizes a 3D Gaussian Noise Prior for enhanced inter-frame consistency. Additionally, video-to-video (V2V) modules for face and tiled video generation address face deblurring and video upscaling for higher resolution outputs.
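The 3D Gaussian Noise Prior amounts to drawing the initial latent noise for all frames jointly, so that frames start from correlated rather than independent noise. The sketch below is a simplified PyTorch illustration of that idea; the latent shapes and the mixing weight alpha are assumptions for illustration, not the repository's exact implementation.

```python
import math
import torch

def correlated_noise_prior(num_frames, channels, height, width, alpha=0.2, generator=None):
    """Sample initial latents whose noise is shared across frames.

    Blending a single shared noise map with per-frame noise makes the
    starting latents of neighboring frames similar, which encourages
    inter-frame consistency during denoising. `alpha` (an assumed
    hyperparameter) controls how strongly frames are correlated.
    """
    shared = torch.randn(1, channels, height, width, generator=generator)
    per_frame = torch.randn(num_frames, channels, height, width, generator=generator)
    noise = alpha * shared + (1.0 - alpha) * per_frame
    # Rescale so each frame's noise keeps unit variance, as the diffusion
    # sampler expects standard Gaussian inputs.
    return noise / math.sqrt(alpha ** 2 + (1.0 - alpha) ** 2)

# Example: initial latents for a 16-frame clip at 64x64 latent resolution.
latents = correlated_noise_prior(num_frames=16, channels=4, height=64, width=64)
print(latents.shape)  # torch.Size([16, 4, 64, 64])
```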
Quick Start & Requirements
Create the environment with conda env create -f environment.yaml after cloning the repository. The setup additionally requires git lfs and checkpoints for Stable Diffusion 1.5, and generation runs are specified through a .yaml configuration.
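The .yaml configuration typically bundles the prompt, checkpoint paths, and sampling settings for a run. The snippet below only illustrates reading such a file with PyYAML; the path and key names are hypothetical placeholders, not the repository's actual schema.

```python
import yaml

# Hypothetical config path and keys, used only to illustrate the pattern.
with open("configs/example.yaml") as f:
    cfg = yaml.safe_load(f)

prompt = cfg.get("prompt", "a photo of a person walking on a beach")
sd_checkpoint = cfg.get("pretrained_model_path", "checkpoints/stable-diffusion-v1-5")
num_frames = cfg.get("video_length", 16)
print(prompt, sd_checkpoint, num_frames)
```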
Highlighted Details
Maintenance & Community
The project is associated with authors Ze Ma, Daquan Zhou, and Zhen Dong. The repository is marked inactive, with its last update roughly a year ago, and further details on community channels or roadmaps are not explicitly provided in the README.
Licensing & Compatibility
The project is released for academic use. The README does not specify a license; its disclaimer suggests restrictions on commercial use and places liability for generated content on the user.
Limitations & Caveats
The project is released for academic use, and contributors disclaim responsibility for user-generated content, emphasizing user liability and ethical use. Future features like pose/depth/stretch control and additional demo releases (Magic-Me Instant, Magic-Me Crowd) are listed as planned.