MotionClone by LPengYang

Training-free framework for controllable video generation

created 1 year ago
502 stars

Top 62.8% on sourcepulse

View on GitHub
Project Summary

MotionClone is a training-free framework for controllable video generation that clones motion from reference videos. It targets researchers and developers in AI video generation, offering a flexible and efficient alternative to methods requiring model training or fine-tuning for motion transfer. The primary benefit is achieving diverse motion cloning across text-to-video and image-to-video tasks without complex video inversion.

How It Works

MotionClone uses sparse temporal attention weights as its motion representation. The core observation is that the dominant components of temporal attention maps drive motion synthesis, while the remaining components mostly capture noise. Because these sparse weights can be extracted from the reference video in a single denoising step, the framework avoids cumbersome video inversion and can apply the extracted motion representation directly as guidance across text-to-video and image-to-video scenarios.
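
The sparsification step can be pictured with a short sketch. This is a minimal illustration under assumed tensor shapes, not MotionClone's actual code: dominant_mask, motion_guidance_loss, and the (heads, tokens, frames, frames) layout are all assumptions; in the real framework the masked comparison drives guidance inside the diffusion denoising loop.

```python
import torch

def dominant_mask(temporal_attn: torch.Tensor, top_k: int = 1) -> torch.Tensor:
    """Binary mask marking the top-k entries of a temporal attention map
    along its last (key-frame) axis. These dominant entries serve as the
    sparse motion representation; the rest are treated as noise."""
    _, indices = temporal_attn.topk(top_k, dim=-1)
    mask = torch.zeros_like(temporal_attn)
    mask.scatter_(-1, indices, 1.0)
    return mask

def motion_guidance_loss(ref_attn: torch.Tensor,
                         gen_attn: torch.Tensor,
                         top_k: int = 1) -> torch.Tensor:
    """Penalize mismatch between the generated video's temporal attention
    and the reference video's attention at the dominant positions only."""
    mask = dominant_mask(ref_attn, top_k)
    return ((gen_attn - ref_attn) * mask).pow(2).mean()

# Toy example with an assumed (heads, spatial tokens, frames, frames) layout.
ref = torch.softmax(torch.randn(8, 64, 16, 16), dim=-1)
gen = torch.softmax(torch.randn(8, 64, 16, 16), dim=-1)
print(motion_guidance_loss(ref, gen).item())
```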

Quick Start & Requirements

  • Install: Clone the repository and create a conda environment using environment.yaml.
  • Prerequisites: Python 3.11.3 recommended. Requires downloading Stable Diffusion v1.5, community models (RealisticVision V5.1), AnimateDiff motion modules (v3_adapter_sd_v15.ckpt, v3_sd15_mm.ckpt), and SparseCtrl checkpoints (v3_sd15_sparsectrl_rgb.ckpt, v3_sd15_sparsectrl_scribble.ckpt).
  • Setup: Requires manual downloading and placement of multiple large model files (a pre-flight check sketch follows this list).
  • Usage: Examples provided for text-to-video (camera/object motion) and image-to-video (sketch/RGB).
  • Links: Project Page (inferred from the demo link), arXiv
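
Since setup hinges on several manually downloaded checkpoints, a small pre-flight check can catch missing files before a run. The sketch below is purely illustrative: the directory layout under models/ and the exact file names are assumptions based on the list above, not paths mandated by the repository.

```python
from pathlib import Path

# Hypothetical layout; adjust to wherever the repository expects the weights.
REQUIRED_CHECKPOINTS = [
    "models/StableDiffusion/stable-diffusion-v1-5",           # SD v1.5 weights (directory)
    "models/DreamBooth_LoRA/realisticVisionV51.safetensors",   # community model, assumed file name
    "models/Motion_Module/v3_sd15_mm.ckpt",                    # AnimateDiff motion module
    "models/Motion_Module/v3_adapter_sd_v15.ckpt",             # AnimateDiff adapter
    "models/SparseCtrl/v3_sd15_sparsectrl_rgb.ckpt",
    "models/SparseCtrl/v3_sd15_sparsectrl_scribble.ckpt",
]

def check_checkpoints(root: str = ".") -> bool:
    """Report any missing model files before launching generation."""
    missing = [p for p in REQUIRED_CHECKPOINTS if not (Path(root) / p).exists()]
    for p in missing:
        print(f"missing: {p}")
    return not missing

if __name__ == "__main__":
    ok = check_checkpoints()
    print("all checkpoints found" if ok else "download the missing files first")
```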

Highlighted Details

  • Training-free motion cloning for controllable video generation.
  • Utilizes sparse temporal attention weights for motion representation.
  • Bypasses video inversion for efficiency and flexibility.
  • Supports text-to-video, image-to-video, and sketch-to-video.
  • Claims superiority in motion fidelity, textual alignment, and temporal consistency.

Maintenance & Community

The project is associated with an ICLR 2025 submission. The code is released, and the authors welcome issues and questions. The project acknowledges contributions from AnimateDiff and FreeControl repositories.

Licensing & Compatibility

The repository does not explicitly state a license. The disclaimer notes that copyrights for demo images and audio belong to community users.

Limitations & Caveats

The setup requires manual downloading of several large model files, which can be time-consuming. The code is an initial release and may be subject to further optimization.

Health Check

  • Last commit: 1 month ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 0
  • Issues (30d): 1
  • Star History: 22 stars in the last 90 days
