champ by fudan-generative-vision

Human image animation research paper using 3D parametric guidance

created 1 year ago
4,231 stars

Top 11.8% on sourcepulse

Project Summary

Champ addresses controllable and consistent human image animation by leveraging 3D parametric guidance, targeting researchers and developers in computer vision and graphics. Given a reference image and a driving motion sequence, it animates the depicted human subject, offering a novel approach to character animation.

How It Works

Champ uses a diffusion model guided by 3D human pose and shape parameters from SMPL (the Skinned Multi-Person Linear model), derived from driving motion sequences. Conditioning generation on these explicit 3D representations of human movement enables fine-grained control while keeping the animation consistent and realistic, and the SMPL parameterization provides a robust, interpretable way to capture and transfer human motion.
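
Below is a minimal sketch of the 3D parametric side of this idea, assuming the smplx package and a locally downloaded SMPL model file; the model path is hypothetical and this is not Champ's actual pipeline code.

    # Minimal sketch (assumes the smplx package and a downloaded SMPL model).
    # Shows how per-frame SMPL pose/shape parameters yield a 3D mesh from which
    # guidance maps (depth, normals, semantics) could then be rendered.
    import torch
    import smplx

    model = smplx.create("models/", model_type="smpl")  # hypothetical model path

    def mesh_for_frame(betas, body_pose, global_orient):
        """Return the (6890, 3) SMPL vertex array for one motion frame."""
        out = model(betas=betas, body_pose=body_pose, global_orient=global_orient)
        return out.vertices[0].detach()

    # One frame of a driving motion: shared shape (betas), per-frame pose.
    betas = torch.zeros(1, 10)         # body shape coefficients
    body_pose = torch.zeros(1, 69)     # 23 joints x 3 axis-angle values
    global_orient = torch.zeros(1, 3)  # root orientation

    verts = mesh_for_frame(betas, body_pose, global_orient)
    # Each frame's mesh would be rasterized into depth/normal/semantic maps that
    # condition the diffusion model alongside DWPose keypoints.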

Quick Start & Requirements

  • Install: pip install -r requirements.txt or poetry install --no-root
  • Prerequisites: Ubuntu 20.04 or Windows 11, CUDA 12.1, Python 3.10. Tested GPUs: A100, RTX 3090.
  • Setup: download the pretrained models and prepare guidance motions (SMPL & Rendering); a download sketch follows this list. Inference on a 250-frame motion requires ~20 GB of VRAM.
  • Links: Docs, Demo, ComfyUI Wrapper
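
A minimal sketch for the model-download step, assuming the huggingface_hub package; the repo id below is an assumption about where the weights are hosted and should be verified against the official docs.

    # Minimal sketch (assumes huggingface_hub is installed).
    from huggingface_hub import snapshot_download

    snapshot_download(
        repo_id="fudan-generative-ai/champ",  # assumed hub location; verify in docs
        local_dir="pretrained_models",
    )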

Highlighted Details

  • ECCV 2024 paper.
  • Supports animation driven by multiple guidance signals, including depth, DWPose, normal maps, and semantic maps (a toy stacking sketch follows this list).
  • Offers scripts for SMPL & Rendering and Blender add-ons for motion processing.
  • Training code and sample datasets are released.
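
Since several guidance signals are supported, the toy sketch below shows one plausible way to stack per-frame guidance maps into a single conditioning tensor; shapes and names are illustrative, not Champ's actual code.

    # Toy sketch: stack per-frame guidance maps (depth, normal, semantic, DWPose)
    # into one conditioning tensor. Shapes and channel counts are illustrative.
    import torch

    def stack_guidance(depth, normal, semantic, dwpose):
        """Each input: (frames, channels, H, W); returns (frames, C_total, H, W)."""
        return torch.cat([depth, normal, semantic, dwpose], dim=1)

    frames, h, w = 16, 64, 64
    cond = stack_guidance(
        torch.rand(frames, 1, h, w),  # depth: one channel
        torch.rand(frames, 3, h, w),  # surface normals encoded as xyz
        torch.rand(frames, 3, h, w),  # semantic map rendered as RGB
        torch.rand(frames, 3, h, w),  # DWPose skeleton rendered as RGB
    )
    assert cond.shape == (frames, 10, h, w)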

Maintenance & Community

  • Active development with recent releases of training code and sample data.
  • Community contributions include a ComfyUI wrapper and a Replicate demo.
  • Roadmap available for future developments.
  • Contact: siyuzhu@fudan.edu.cn for research opportunities.

Licensing & Compatibility

  • The repository does not explicitly state a license. The description credits runwayml and stabilityai for the Stable Diffusion models, implying potential licensing obligations inherited from those sources.

Limitations & Caveats

  • High VRAM requirement (~20 GB) for longer sequences; motions can be segmented to fit lower-VRAM GPUs (see the chunking sketch after this list).
  • Training requires custom data processing into SMPL & DWPose format.
  • The Gradio demo is listed as TBD on the roadmap.
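
For the VRAM caveat above, here is a minimal sketch of segmenting a long motion into overlapping chunks so each inference pass fits a smaller GPU; the segment length and overlap are assumptions to tune per GPU.

    # Minimal sketch: split a long driving motion into fixed-size segments.
    # seg_len and overlap are assumptions; tune them to your GPU (the README
    # cites ~20 GB of VRAM for a 250-frame motion).
    def segment_motion(num_frames: int, seg_len: int = 64, overlap: int = 4):
        """Yield (start, end) frame ranges with a small overlap for blending."""
        start = 0
        while start < num_frames:
            end = min(start + seg_len, num_frames)
            yield start, end
            if end == num_frames:
                break
            start = end - overlap

    print(list(segment_motion(250)))
    # [(0, 64), (60, 124), (120, 184), (180, 244), (240, 250)]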

Health Check

  • Last commit: 1 year ago
  • Responsiveness: 1 day
  • Pull requests (30d): 0
  • Issues (30d): 1
  • Star history: 48 stars in the last 90 days
