priorMDM by priorMDM

PyTorch code for human motion diffusion as a generative prior

created 2 years ago
490 stars

Top 63.8% on sourcepulse

Project Summary

PriorMDM provides official PyTorch implementations for human motion generation research, focusing on diffusion models as generative priors. It offers solutions for single-person long-sequence generation (DoubleTake), two-person interaction synthesis (ComMDM), and fine-grained motion control, targeting researchers and developers in computer graphics and animation.

How It Works

The project leverages diffusion models, a class of generative models known for high-quality sample synthesis, and adapts them to human motion by treating motion sequences as the data being denoised. The denoiser follows the transformer-based design of MDM (the Motion Diffusion Model it builds on), conditioned on CLIP text embeddings or on motion prefixes, enabling diverse motion generation tasks.
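To make the mechanism concrete, here is a minimal, framework-agnostic sketch of the reverse diffusion loop over a motion tensor. It is illustrative only, not the repository's actual code: the `denoiser` callable, the linear beta schedule, and the HumanML3D-style shapes (196 frames, 263 pose features) are assumptions.

```python
import numpy as np

def sample_motion(denoiser, n_frames=196, n_feats=263, n_steps=50, rng=None):
    """Illustrative DDPM-style reverse diffusion over a motion sequence.

    `denoiser(x, t)` is assumed to predict the clean motion x0 from a noisy
    input, as MDM-style models do. Schedule and shapes are assumptions.
    """
    if rng is None:
        rng = np.random.default_rng(0)
    betas = np.linspace(1e-4, 0.02, n_steps)      # linear noise schedule
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)

    x = rng.standard_normal((n_frames, n_feats))  # start from pure noise
    for t in reversed(range(n_steps)):
        x0_pred = denoiser(x, t)                  # model predicts clean sample
        ab, a = alpha_bars[t], alphas[t]
        ab_prev = alpha_bars[t - 1] if t > 0 else 1.0
        # posterior mean of q(x_{t-1} | x_t, x0): standard DDPM formula
        coef_x0 = np.sqrt(ab_prev) * betas[t] / (1.0 - ab)
        coef_xt = np.sqrt(a) * (1.0 - ab_prev) / (1.0 - ab)
        mean = coef_x0 * x0_pred + coef_xt * x
        if t > 0:
            var = betas[t] * (1.0 - ab_prev) / (1.0 - ab)
            x = mean + np.sqrt(var) * rng.standard_normal(x.shape)
        else:
            x = mean
    return x
```

In the real repo the denoiser is a PyTorch transformer; the loop above only shows where conditioning and iterative denoising fit in.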

Quick Start & Requirements

  • Install: Create the conda environment from environment.yml, install the remaining Python dependencies with pip, and fetch the CLIP and smplx packages.
  • Prerequisites: Python 3.8, CUDA-capable GPU, ffmpeg, spaCy (en_core_web_sm).
  • Data: Requires HumanML3D, BABEL, and 3DPW datasets, along with SMPL body models. Links and download scripts are provided.
  • Pretrained Models: Downloadable checkpoints for DoubleTake, ComMDM, and fine-tuned control models.
  • Setup Time: Moderate, due to dataset and dependency downloads.
  • Docs: Webpage
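Before starting setup, a small script can confirm the prerequisites listed above. This is an illustrative helper, not part of the repo; the repo's environment.yml remains the authoritative spec.

```python
import shutil
import sys

def check_prereqs():
    """Report whether the listed prerequisites appear to be present.

    Returns a dict of booleans; checks are illustrative assumptions.
    """
    results = {
        # The repo targets Python 3.8
        "python_3.8+": sys.version_info >= (3, 8),
        # ffmpeg is needed to render motion videos
        "ffmpeg": shutil.which("ffmpeg") is not None,
    }
    try:
        import torch
        # A CUDA-capable GPU is listed as a prerequisite
        results["cuda"] = torch.cuda.is_available()
    except ImportError:
        results["cuda"] = False
    return results
```

Running `check_prereqs()` before the conda setup saves a failed environment build when ffmpeg or CUDA is missing.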

Highlighted Details

  • Supports text-to-motion, motion completion, and fine-tuned control of specific body parts (e.g., wrist, foot).
  • Enables generation of long motion sequences and interactions between two people.
  • Includes scripts for training custom models and evaluating generated motions.
  • Provides functionality to render generated motions as SMPL meshes for integration into 3D software.
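The long-sequence capability can be illustrated with a hedged sketch (not DoubleTake's actual implementation): consecutive segments generated with overlapping "handshake" frames can be stitched by cross-fading the overlap. The linear blend weights here are an assumption.

```python
import numpy as np

def stitch_segments(segments, overlap):
    """Concatenate motion segments, cross-fading `overlap` shared frames.

    Each segment is (n_frames, n_feats). Consecutive segments are assumed to
    roughly agree on their boundary frames, as a shared handshake encourages;
    linear fade weights are an illustrative choice.
    """
    out = segments[0]
    w = np.linspace(0.0, 1.0, overlap)[:, None]  # fade-in weight per frame
    for seg in segments[1:]:
        blended = (1.0 - w) * out[-overlap:] + w * seg[:overlap]
        out = np.concatenate([out[:-overlap], blended, seg[overlap:]], axis=0)
    return out
```

Given two 10-frame segments and a 4-frame overlap, this yields a single 16-frame sequence with a smooth transition in the shared region.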

Maintenance & Community

The project accompanies an ICLR 2024 paper and lists several academic contributors. The README does not link to community resources such as a Discord or forum.

Licensing & Compatibility

  • License: MIT License.
  • Compatibility: Requires adherence to licenses of dependent libraries (CLIP, SMPL, SMPL-X, PyTorch3D) and datasets. Commercial use may be restricted by these underlying licenses.

Limitations & Caveats

The 3DPW dataset requires cleaning even after the provided processing. Some generation tasks have maximum motion lengths (e.g., 9.8 seconds for text-to-motion). The project relies on specific versions of external libraries, which might require careful dependency management.
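The 9.8-second ceiling follows from HumanML3D's 196-frame sequence cap at 20 fps (a property of the dataset, not a model choice):

```python
MAX_FRAMES = 196  # HumanML3D per-sequence frame cap
FPS = 20          # HumanML3D frame rate

max_seconds = MAX_FRAMES / FPS
print(max_seconds)  # → 9.8
```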

Health Check

  • Last commit: 6 months ago
  • Responsiveness: 1 week
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 19 stars in the last 90 days
