MCG-NJU
AI framework for harmonized human image animation
Top 58.5% on SourcePulse
Summary
SteadyDancer tackles human image animation challenges like spatio-temporal misalignments and identity drift. It provides a robust Image-to-Video framework for high-fidelity, coherent animations, offering superior visual quality and control with reduced training resources compared to prior methods.
How It Works
SteadyDancer follows the Image-to-Video (I2V) paradigm rather than the Reference-to-Video (R2V) paradigm. Because I2V generation starts from the input image, the first frame is preserved by construction, and a Motion-to-Image Alignment step reconciles the driving motion with that image. Together these address the spatial-structural inconsistencies and temporal start-gaps that cause identity drift and artifacts when R2V methods are applied to real-world data.
Quick Start & Requirements
Installation requires cloning the repo, creating a Python 3.10 Conda environment, and installing PyTorch 2.5.1 (CUDA 12.1 build) along with flash-attention and xformers. The remaining core dependencies are listed in requirements.txt. mmcv may need to be compiled manually from source, which requires GCC 5.4 or newer. Pre-trained weights for DW-Pose and SteadyDancer-14B must be downloaded from Hugging Face or ModelScope. Inference runs in two stages: pose extraction (preprocess/pose_align.py), then animation generation (generate_dancer.py). Both single-GPU and multi-GPU (FSDP + xDiT USP) modes are supported. Key resources include the official paper and the X-Dance benchmark.
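Because the setup involves several packages that are easy to miss, a quick environment check before running inference can save time. The following is a minimal sketch, not part of the SteadyDancer repository: the function names are hypothetical, and the import names (torch, flash_attn, xformers, mmcv) are assumed to be the module names under which the packages listed above are installed.

```python
import importlib.util
import sys

# Assumed import names for the packages named in the requirements above.
REQUIRED_MODULES = ["torch", "flash_attn", "xformers", "mmcv"]


def python_ok(version_info=sys.version_info):
    """Return True if the interpreter is Python 3.10, the version the setup targets."""
    return tuple(version_info[:2]) == (3, 10)


def missing_modules(names=REQUIRED_MODULES):
    """Return the subset of `names` that cannot be located in this environment.

    find_spec only locates a top-level module; it does not import it, so the
    check stays cheap even for large packages.
    """
    return [name for name in names if importlib.util.find_spec(name) is None]


def report():
    """Build a human-readable summary of the environment check."""
    lines = []
    if not python_ok():
        lines.append(
            f"Python 3.10 expected, found {sys.version_info[0]}.{sys.version_info[1]}"
        )
    for name in missing_modules():
        lines.append(f"missing module: {name}")
    return lines or ["environment looks complete"]
```

Calling report() inside the Conda environment lists anything that still needs to be installed. The sketch only confirms that the modules can be found; it does not verify the PyTorch or CUDA versions, nor that mmcv was compiled correctly.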
Maintenance & Community
The project welcomes community contributions and features integrations with WanGP and ComfyUI. Recent updates include GGUF weights, multi-GPU inference support, and the X-Dance benchmark release.
Licensing & Compatibility
The project is released under the permissive Apache-2.0 license, allowing for broad compatibility with commercial and closed-source applications.
Limitations & Caveats
Installation is a potential hurdle, particularly when mmcv must be compiled manually. Multi-GPU inference may yield non-deterministic results, which hinders reproducibility. Early community integrations such as ComfyUI might lack full feature parity, which can affect performance.