ctm by sony

PyTorch implementation for a consistency trajectory model research paper

Created 2 years ago
325 stars

Top 83.7% on SourcePulse

Project Summary

This repository provides a PyTorch implementation of Consistency Trajectory Models (CTM), a diffusion model sampling technique that reports state-of-the-art results on CIFAR-10 and ImageNet 64x64. It is aimed at researchers and practitioners in generative modeling who want to improve sample fidelity and gain more control over the sampling process.

How It Works

CTM learns the probability flow ordinary differential equation (PF ODE) trajectory of a diffusion model. Compared with traditional diffusion sampling methods, this allows a wider range of sampling options and a better trade-off between computational cost and sample quality. The sampling process can be adjusted to suit different computational budgets.
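As a rough illustration of what learning the trajectory makes possible at sampling time, the sketch below runs the gamma-sampling loop described in the CTM paper on a toy problem. Nothing here comes from the repository: toy_jump is a hypothetical stand-in for the trained network, built from the closed-form PF ODE solution for one-dimensional Gaussian data under the assumed noise schedule sigma(t) = t, and gamma_sample, SIGMA_DATA, and the time grid are illustrative names and values.

```python
import math
import random

SIGMA_DATA = 0.5  # assumed std of the toy 1-D Gaussian data distribution


def toy_jump(x, t, s, sigma_data=SIGMA_DATA):
    """Exact PF ODE solution from time t to time s (s <= t) for N(0, sigma_data^2) data.

    Hypothetical stand-in for a trained CTM network G(x_t, t, s).
    """
    return x * math.sqrt((sigma_data**2 + s**2) / (sigma_data**2 + t**2))


def gamma_sample(x, times, gamma, jump=toy_jump, rng=random):
    """Gamma-sampling over a decreasing time grid that ends at 0.0.

    gamma = 0 gives a fully deterministic sampler; gamma = 1 jumps to time 0
    and re-noises at every step.
    """
    for t_cur, t_next in zip(times[:-1], times[1:]):
        # Denoise: jump along the trajectory to a point slightly below t_next.
        s = math.sqrt(1.0 - gamma**2) * t_next
        x = jump(x, t_cur, s)
        # Diffuse: add fresh noise so the sample sits at noise level t_next.
        x = x + gamma * t_next * rng.gauss(0.0, 1.0)
    return x


if __name__ == "__main__":
    random.seed(0)
    t_max = 80.0
    grid = [t_max, 10.0, 1.0, 0.0]  # three sampling steps
    start_std = math.sqrt(SIGMA_DATA**2 + t_max**2)
    draws = [gamma_sample(random.gauss(0.0, start_std), grid, gamma=0.5)
             for _ in range(5000)]
    mean = sum(draws) / len(draws)
    std = math.sqrt(sum((d - mean) ** 2 for d in draws) / len(draws))
    print(f"sample std: {std:.3f} (toy data std: {SIGMA_DATA})")
```

In this toy setting the jump is exact, so any gamma and any grid recover the data distribution. With a trained network the jump is only approximate, and the choice of gamma and step count becomes the knob for trading compute against sample quality.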

Quick Start & Requirements

  • Installation: Docker is the recommended installation method. Pull the latest image with docker pull dongjun57/ctm-docker:latest, then create a container with GPU support and volume mounts for data and checkpoints.
  • Prerequisites: PyTorch, TensorFlow (with CUDA), piq==0.7.0, joblib==0.14.0, albumentations==0.4.3, lmdb, CLIP, Pillow, flash-attn, xformers, mpi4py, nvidia-ml-py3, and timm==0.4.12. You also need access to the ILSVRC2012 dataset, pre-trained diffusion models, and reference statistics for FID, sFID, IS, precision, and recall.
  • Resources: Training is compute-intensive: 10k-50k iterations for CTM+DSM and >=30k iterations for CTM+DSM+GAN. Evaluation requires >=50k generated samples.
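Because several of the prerequisites above are pinned to exact versions, a quick check of the active environment can save a failed run when working outside the Docker image. The following is a minimal sketch using only the standard library; check_pins and PINNED are hypothetical names, and the distribution names are assumed to match the package names in the list above.

```python
from importlib import metadata

# Pins copied from the prerequisites list; the other packages are listed without versions.
PINNED = {
    "piq": "0.7.0",
    "joblib": "0.14.0",
    "albumentations": "0.4.3",
    "timm": "0.4.12",
}


def check_pins(pins=PINNED, get_version=metadata.version):
    """Return {package: (wanted, found)} for each missing or mismatched package."""
    problems = {}
    for name, wanted in pins.items():
        try:
            found = get_version(name)
        except metadata.PackageNotFoundError:
            found = None  # package is not installed at all
        if found != wanted:
            problems[name] = (wanted, found)
    return problems


if __name__ == "__main__":
    for name, (wanted, found) in check_pins().items():
        print(f"{name}: want {wanted}, found {found or 'not installed'}")
```

An empty result means every pinned package is installed at the expected version. It says nothing about the unpinned packages, the dataset, or the reference statistics.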

Highlighted Details

  • Reports state-of-the-art FID scores of 1.73 on CIFAR-10 and 1.92 on ImageNet 64x64 with single-step sampling.
  • Offers diverse sampling options and balances computational budget with sample fidelity.
  • Official PyTorch implementation of the ICLR'24 paper "Consistency Trajectory Models: Learning Probability Flow ODE Trajectory of Diffusion".
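For readers unfamiliar with the FID figures above: FID is the Fréchet distance between two Gaussians, one fitted to Inception features of generated samples and one given by the reference statistics (a mean and a covariance). The sketch below shows that distance in a deliberately simplified form that assumes diagonal covariances. The real metric uses full covariance matrices and a matrix square root, so frechet_distance_diag is a hypothetical teaching aid and not the evaluation code used by the project.

```python
import math


def frechet_distance_diag(mu1, var1, mu2, var2):
    """Fréchet distance between N(mu1, diag(var1)) and N(mu2, diag(var2))."""
    if not (len(mu1) == len(var1) == len(mu2) == len(var2)):
        raise ValueError("all inputs must have the same length")
    # Squared Euclidean distance between the two means.
    mean_term = sum((a - b) ** 2 for a, b in zip(mu1, mu2))
    # With diagonal covariances, Tr(S1 + S2 - 2 (S1 S2)^(1/2)) reduces to a
    # per-dimension sum of (sqrt(v1) - sqrt(v2))^2.
    cov_term = sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(var1, var2))
    return mean_term + cov_term


if __name__ == "__main__":
    reference = ([0.0, 0.0], [1.0, 1.0])   # made-up reference mean and variances
    generated = ([0.1, -0.2], [1.2, 0.9])  # made-up statistics of generated samples
    print(frechet_distance_diag(*generated, *reference))
```

Lower is better, and identical statistics give a distance of zero.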

Maintenance & Community

The project is associated with Sony and academic institutions (Stanford). Contact information for key researchers is provided. No explicit community channels (Discord/Slack) or roadmap are mentioned in the README.

Licensing & Compatibility

The README does not state a license, so the terms of use, including commercial use, are unspecified. The code is released as the official implementation of a research paper, which suggests it is intended primarily for research.

Limitations & Caveats

The setup is complex: it relies heavily on Docker and on pinned versions of many dependencies. The project also requires a large dataset (ILSVRC2012), pre-trained models, and reference statistics for evaluation. Using a custom dataset requires manual code changes.

Health Check

  • Last commit: 2 months ago
  • Responsiveness: Inactive
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 5 stars in the last 30 days
