LayoutDiffusion by ZGCTroy

Diffusion model for layout-to-image generation

Created 3 years ago

318 stars

Top 84.9% on SourcePulse

Project Summary

LayoutDiffusion is a controllable diffusion model that generates images from layout specifications, aimed at researchers and developers in computer vision and generative AI. Conditioning generation on a spatial layout gives precise control over image composition.
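To make "layout specification" concrete, here is a minimal sketch of one way a layout could be represented: a list of objects, each with a class label and a normalized bounding box. The names `LayoutObject` and `to_pixel_boxes` are hypothetical and used only for illustration; the repository's actual data format may differ.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class LayoutObject:
    """One object in a layout: a class label plus a normalized bounding box."""
    label: str
    # (x0, y0, x1, y1) as fractions of image width/height, each in [0, 1]
    box: Tuple[float, float, float, float]


def to_pixel_boxes(layout: List[LayoutObject], size: int):
    """Scale normalized boxes to integer pixel coordinates for a size x size image."""
    result = []
    for obj in layout:
        x0, y0, x1, y1 = obj.box
        # Reject boxes that are empty, inverted, or outside the unit square.
        if not (0.0 <= x0 < x1 <= 1.0 and 0.0 <= y0 < y1 <= 1.0):
            raise ValueError(f"invalid box for {obj.label!r}: {obj.box}")
        pixels = tuple(round(v * size) for v in (x0, y0, x1, y1))
        result.append((obj.label, pixels))
    return result


# A toy two-object layout (assumed labels, not taken from the repository).
layout = [
    LayoutObject("sky", (0.0, 0.0, 1.0, 0.5)),
    LayoutObject("person", (0.25, 0.25, 0.5, 1.0)),
]
pixel_boxes = to_pixel_boxes(layout, size=256)
```

Keeping boxes normalized means the same layout can be reused across the different output resolutions the project supports.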

How It Works

LayoutDiffusion builds on the guided-diffusion framework, adding a layout encoder, the Layout Fusion Module (LFM), and object-aware cross-attention (OaCA). Together these let the model integrate spatial layout information into the diffusion process, giving more accurate and controllable image generation than unconditional diffusion models.
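The general idea of fusing layout information into image features through cross-attention can be sketched as follows. This is plain scaled dot-product cross-attention in NumPy, not the paper's OaCA, which adds object-aware details not covered here. All shapes, weights, and the function name `layout_cross_attention` are assumptions made for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def layout_cross_attention(image_tokens, layout_tokens, w_q, w_k, w_v):
    """Let each image token attend to every layout token.

    image_tokens: (N, D) flattened image features (queries)
    layout_tokens: (M, D) one embedding per layout object (keys and values)
    Returns the fused (N, D) features and the (N, M) attention weights.
    """
    q = image_tokens @ w_q
    k = layout_tokens @ w_k
    v = layout_tokens @ w_v
    weights = softmax(q @ k.T / np.sqrt(q.shape[-1]), axis=-1)
    fused = image_tokens + weights @ v  # residual connection
    return fused, weights


rng = np.random.default_rng(0)
D = 16
image_tokens = rng.normal(size=(64, D))   # e.g. an 8x8 feature map, flattened
layout_tokens = rng.normal(size=(3, D))   # hypothetical embeddings of 3 objects
w_q, w_k, w_v = (rng.normal(size=(D, D)) * 0.1 for _ in range(3))

fused, weights = layout_cross_attention(image_tokens, layout_tokens, w_q, w_k, w_v)
```

Each row of `weights` shows how strongly one image location draws on each layout object, which is the mechanism that lets a layout steer what appears where.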

Quick Start & Requirements

Install: Use conda for environment setup, then pip install -e ./repositories/dpm_solver.
Prerequisites: Python 3.8, PyTorch 1.10.1, CUDA 11.3, omegaconf, opencv-python, gradio.
Demo: Run python scripts/launch_gradio_app.py with a specified config file.
Pretrained Models: Available for COCO-Stuff and VG datasets at various resolutions.
Docs: CVPR 2023 Paper
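Because the prerequisites pin old versions, it can help to check an environment before running anything. The sketch below uses only the standard library; the `EXPECTED` table and the function `check_prerequisites` are hypothetical helpers built from the versions listed above, not part of the repository.

```python
import sys
from importlib import metadata

# Distribution names mapped to the version prefix listed in this summary.
# None means the summary names the package but gives no version.
EXPECTED = {
    "torch": "1.10.1",
    "omegaconf": None,
    "opencv-python": None,
    "gradio": None,
}


def check_prerequisites(expected):
    """Return a {distribution: status} report for the installed environment."""
    report = {}
    for dist, wanted in expected.items():
        try:
            found = metadata.version(dist)
        except metadata.PackageNotFoundError:
            report[dist] = "missing"
            continue
        # Prefix match so local build tags such as "+cu113" still pass.
        if wanted is None or found.startswith(wanted):
            report[dist] = f"ok ({found})"
        else:
            report[dist] = f"mismatch (found {found}, expected {wanted})"
    return report


python_ok = sys.version_info[:2] == (3, 8)
report = check_prerequisites(EXPECTED)

print(f"Python 3.8: {'ok' if python_ok else 'mismatch'}")
for dist, status in report.items():
    print(f"{dist}: {status}")
```

This does not verify the CUDA version; that would have to be checked separately.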

Highlighted Details

Achieves FID scores as low as 15.61 on COCO-Stuff 256x256.
Supports training on both latent and image spaces.
Includes a Gradio WebUI demo for easy interaction.
Provides evaluation scripts for FID, IS, DS, YOLO Score, and CAS.
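For readers unfamiliar with FID, the standard definition is the Fréchet distance between two Gaussians fitted to feature vectors of real and generated images. The sketch below implements that formula in NumPy; it is not the repository's evaluation script. In practice the features come from an Inception-v3 network, and the random arrays used here only stand in for them.

```python
import numpy as np


def gaussian_stats(features):
    """Mean vector and covariance matrix of a (num_samples, dim) feature array."""
    return features.mean(axis=0), np.cov(features, rowvar=False)


def frechet_distance(mu_r, sigma_r, mu_g, sigma_g):
    """||mu_r - mu_g||^2 + Tr(S_r + S_g - 2 (S_r S_g)^(1/2))."""
    diff = mu_r - mu_g
    # Tr((S_r S_g)^(1/2)) equals the sum of square roots of the eigenvalues
    # of S_r S_g, which are real and non-negative for covariance matrices.
    # Tiny negative or imaginary parts from round-off are clipped away.
    eigvals = np.linalg.eigvals(sigma_r @ sigma_g)
    tr_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()
    return float(diff @ diff + np.trace(sigma_r) + np.trace(sigma_g) - 2.0 * tr_sqrt)


rng = np.random.default_rng(0)
real_feats = rng.normal(size=(500, 4))            # placeholder "real" features
fake_feats = rng.normal(loc=0.5, size=(500, 4))   # placeholder "generated" features

fid_same = frechet_distance(*gaussian_stats(real_feats), *gaussian_stats(real_feats))
fid_diff = frechet_distance(*gaussian_stats(real_feats), *gaussian_stats(fake_feats))
```

Identical feature sets give a distance of about zero, and the distance grows as the two distributions move apart, which is why lower FID is better.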

Maintenance & Community

The project was accepted to CVPR 2023. The README indicates ongoing work on releasing pretrained latent space models. No specific community channels (Discord/Slack) are listed.

Licensing & Compatibility

The repository is based on openai/guided-diffusion, which is MIT licensed. The README does not explicitly state a license for LayoutDiffusion itself, so commercial use or closed-source linking would require clarification from the authors.

Limitations & Caveats

The README notes that the COCO-Stuff dataset is deprecated. It also warns that the evaluation metrics (FID, IS, LPIPS, CAS) can be confusing because of historical issues in related works, and the authors recommend that beginners look at newer work such as LDM and Frido.

Health Check

Last Commit: 6 months ago
Responsiveness: Inactive
Star History: 3 stars in the last 30 days