CatVTON by Zheng-Chong

Virtual try-on diffusion model research paper

Created 1 year ago
1,501 stars

Top 27.6% on SourcePulse

Project Summary

CatVTON is a diffusion model for virtual try-on, designed for efficiency and ease of use. It targets researchers and developers in computer vision and fashion tech, offering a lightweight architecture for high-resolution image generation with reduced VRAM requirements.

How It Works

CatVTON builds on Stable Diffusion v1.5. Its novelty is a "concatenation" approach that enables parameter-efficient training and simplified inference. The full network has 899.06M parameters, of which only 49.57M are trainable, and inference at 1024x768 resolution needs less than 8GB of VRAM.
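
To make the "concatenation" idea concrete, here is a minimal sketch that joins a person image and a garment image into a single input along one spatial axis. It is an illustration only, not the project's code: the function names, the nested-list image stand-ins, and the choice of axis are all assumptions, and a real pipeline would operate on tensors or latents.

```python
def concat_spatial(person, garment, axis="height"):
    """Join two H x W grids along one spatial axis (hypothetical helper)."""
    if axis == "height":
        # Stacking rows requires both grids to have the same width.
        if len(person[0]) != len(garment[0]):
            raise ValueError("widths must match to concatenate along height")
        return [row[:] for row in person] + [row[:] for row in garment]
    if axis == "width":
        # Joining columns requires both grids to have the same height.
        if len(person) != len(garment):
            raise ValueError("heights must match to concatenate along width")
        return [p_row + g_row for p_row, g_row in zip(person, garment)]
    raise ValueError("axis must be 'height' or 'width'")


def shape(grid):
    """Return (height, width) of a rectangular nested list."""
    return (len(grid), len(grid[0]))


# Tiny 4x3 stand-ins with the same aspect ratio as 1024x768.
person = [[0] * 3 for _ in range(4)]
garment = [[1] * 3 for _ in range(4)]

combined = concat_spatial(person, garment, axis="height")
print(shape(combined))  # (8, 3)
```

The point of the sketch is that both images travel through the same network as one input, so no separate garment encoder is needed in this simplified picture.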

Quick Start & Requirements

  • Install: pip install -r requirements.txt within a conda environment.
  • Prerequisites: Python 3.9.0, CUDA. Datasets like VITON-HD or DressCode are required for inference.
  • Deployment: ComfyUI workflow and Gradio app are available.
  • Docs: https://arxiv.org/abs/2407.15886
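
Before installing, it can help to confirm that the environment matches the prerequisites above. The following is a small preflight sketch, not part of the project: the helper names are hypothetical, and it assumes PyTorch is the CUDA-facing dependency, which the summary does not state explicitly.

```python
import importlib.util
import sys

REQUIRED_PYTHON = (3, 9)  # major.minor from the prerequisites above


def python_matches(required=REQUIRED_PYTHON, actual=None):
    """Return True when the interpreter's major.minor equals `required`."""
    actual = sys.version_info[:2] if actual is None else tuple(actual[:2])
    return tuple(actual) == tuple(required)


def cuda_status():
    """Describe CUDA availability without failing if PyTorch is absent."""
    if importlib.util.find_spec("torch") is None:
        return "torch not installed"
    try:
        import torch  # assumption: PyTorch is what the project uses for CUDA
    except (ImportError, OSError):
        return "torch present but failed to import"
    return "CUDA available" if torch.cuda.is_available() else "CUDA not available"


print("Python 3.9:", python_matches())
print("CUDA:", cuda_status())
```

Run it inside the conda environment before `pip install -r requirements.txt` to catch a version mismatch early.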

Highlighted Details

  • Accepted to ICLR 2025.
  • Supports 1024x768 resolution with < 8GB VRAM.
  • Parameter-efficient training (49.57M trainable parameters).
  • Mask-free version available.
  • Integrates with ComfyUI.
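
The figures quoted in this summary can be sanity-checked with a little arithmetic. The sketch below computes the share of trainable parameters from the 899.06M total and 49.57M trainable counts, and the latent grid size for a 1024x768 image. The 8x downsampling factor is an assumption based on the standard Stable Diffusion v1.5 VAE, not a number stated in this summary.

```python
TOTAL_PARAMS_M = 899.06      # total parameters, in millions
TRAINABLE_PARAMS_M = 49.57   # trainable parameters, in millions
VAE_DOWNSAMPLE = 8           # assumption: standard SD v1.5 VAE factor


def trainable_fraction(trainable, total):
    """Fraction of the network that is updated during training."""
    if total <= 0:
        raise ValueError("total must be positive")
    return trainable / total


def latent_shape(side_a, side_b, factor=VAE_DOWNSAMPLE):
    """Spatial size of the latent grid for an image of the given size."""
    return (side_a // factor, side_b // factor)


fraction = trainable_fraction(TRAINABLE_PARAMS_M, TOTAL_PARAMS_M)
frozen_m = round(TOTAL_PARAMS_M - TRAINABLE_PARAMS_M, 2)

print(f"trainable share: {fraction:.2%}")        # 5.51%
print(f"frozen parameters: {frozen_m}M")         # 849.49M
print("latent grid:", latent_shape(1024, 768))   # (128, 96)
```

Roughly one parameter in eighteen is trained, which is what "parameter-efficient" amounts to in practice here.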

Maintenance & Community

  • Active development with recent updates (CatV2TON, FLUX.1-Fill-dev LoRA).
  • HuggingFace Space available.
  • Maintains an "Awesome-Try-On-Models" repository.

Licensing & Compatibility

  • Licensed under Creative Commons BY-NC-SA 4.0.
  • Non-commercial use only. Derivative works must be shared under the same license.

Limitations & Caveats

The project is primarily tested on Linux; Windows users may encounter issues (see issue #8). The Gradio app is described as not yet stable.

Health Check

  • Last commit: 6 months ago
  • Responsiveness: Inactive
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 28 stars in the last 30 days
