CatVTON by Zheng-Chong

Research-paper implementation of a diffusion model for virtual try-on

Created 1 year ago

1,598 stars

Top 25.8% on SourcePulse

Project Summary

CatVTON is a diffusion model for virtual try-on, designed for efficiency and ease of use. It targets researchers and developers in computer vision and fashion tech, offering a lightweight architecture for high-resolution image generation with reduced VRAM requirements.

How It Works

CatVTON builds on Stable Diffusion v1.5. Its novelty is a "concatenation" approach: the person and garment images are concatenated along the spatial dimension and processed by a single diffusion backbone, which enables parameter-efficient training and simplified inference. The full network has 899.06M parameters, of which only 49.57M are trainable, and inference at 1024x768 resolution requires less than 8GB of VRAM.
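To make the concatenation idea concrete, here is a minimal shape-level sketch. It is not the authors' code. It assumes the usual Stable Diffusion v1.5 VAE settings (8x spatial downsampling, 4 latent channels), reads "1024x768" as height 1024 and width 768, and assumes the person and garment latents are joined along the height axis. The function names and the choice of axis are illustrative assumptions.

```python
from typing import Tuple

Shape = Tuple[int, int, int]  # (channels, height, width)


def latent_shape(height: int, width: int, factor: int = 8, channels: int = 4) -> Shape:
    """Shape of a VAE latent for an image of the given pixel size.

    factor=8 and channels=4 are the usual Stable Diffusion v1.5 VAE values
    (an assumption here, not something the summary states).
    """
    if height % factor or width % factor:
        raise ValueError("image size must be divisible by the VAE factor")
    return (channels, height // factor, width // factor)


def concat_spatial(a: Shape, b: Shape, axis: int = 1) -> Shape:
    """Shape after joining two latents along one spatial axis (1=height, 2=width)."""
    if axis not in (1, 2):
        raise ValueError("axis must be 1 (height) or 2 (width)")
    other = 2 if axis == 1 else 1
    if a[0] != b[0] or a[other] != b[other]:
        raise ValueError("latents must match on the non-concatenated dimensions")
    out = list(a)
    out[axis] = a[axis] + b[axis]
    return (out[0], out[1], out[2])


person = latent_shape(1024, 768)   # (4, 128, 96)
garment = latent_shape(1024, 768)  # (4, 128, 96)
joint = concat_spatial(person, garment, axis=1)
print(joint)  # (4, 256, 96)
```

Under these assumptions the backbone sees one latent twice as tall as a single image latent, so person and garment can interact inside one network pass without a separate garment encoder.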

Quick Start & Requirements

Install: run pip install -r requirements.txt inside a conda environment.
Prerequisites: Python 3.9.0 and CUDA. A dataset such as VITON-HD or DressCode is required for inference.
Deployment: ComfyUI workflow and Gradio app are available.
Docs: https://arxiv.org/abs/2407.15886
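Before installing, the stated prerequisites can be checked with a short preflight script. This is a hypothetical helper, not part of the CatVTON repository: the function name is made up, the presence of nvidia-smi on PATH is only a rough proxy for a working CUDA setup, and the Linux check reflects the note that the project is primarily tested on Linux.

```python
import platform
import shutil
import sys
from typing import Dict, Tuple


def check_environment(
    version_info: Tuple[int, ...] = tuple(sys.version_info[:3]),
) -> Dict[str, bool]:
    """Report whether the local setup matches the stated prerequisites.

    python_3_9: interpreter is in the 3.9 series (the summary lists 3.9.0).
    nvidia_smi_found: NVIDIA driver CLI is on PATH (heuristic for CUDA).
    linux: running on the platform the project is primarily tested on.
    """
    return {
        "python_3_9": tuple(version_info[:2]) == (3, 9),
        "nvidia_smi_found": shutil.which("nvidia-smi") is not None,
        "linux": platform.system() == "Linux",
    }


if __name__ == "__main__":
    for name, ok in check_environment().items():
        print(f"{name}: {'ok' if ok else 'check manually'}")
```

A failed check does not mean the project cannot run; it only flags a difference from the configuration the summary describes.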

Highlighted Details

Accepted to ICLR 2025.
Supports 1024x768 resolution with < 8GB VRAM.
Parameter-efficient training (49.57M trainable parameters).
Mask-free version available.
Integrates with ComfyUI.
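The parameter figures above can be put in perspective with some quick arithmetic. The sketch below computes the trainable share and a rough weight-only memory estimate. The numeric precisions (2 bytes for fp16, 4 bytes for fp32) are assumptions, and activations and framework overhead are not counted, so the result is a lower bound and not an explanation of the < 8GB figure on its own.

```python
TOTAL_PARAMS_M = 899.06     # total parameters, in millions (from the summary)
TRAINABLE_PARAMS_M = 49.57  # trainable parameters, in millions (from the summary)


def trainable_fraction(trainable_m: float, total_m: float) -> float:
    """Share of parameters that are updated during training."""
    return trainable_m / total_m


def weight_memory_gib(params_m: float, bytes_per_param: int) -> float:
    """Memory needed to hold the weights alone, in GiB."""
    return params_m * 1e6 * bytes_per_param / 2**30


share = trainable_fraction(TRAINABLE_PARAMS_M, TOTAL_PARAMS_M)
print(f"trainable share: {share:.2%}")                                      # about 5.51%
print(f"fp16 weights: {weight_memory_gib(TOTAL_PARAMS_M, 2):.2f} GiB")  # about 1.67 GiB
print(f"fp32 weights: {weight_memory_gib(TOTAL_PARAMS_M, 4):.2f} GiB")  # about 3.35 GiB
```

Roughly one parameter in eighteen is trained, and even at full precision the weights by themselves fit comfortably inside an 8GB budget.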

Maintenance & Community

Active development with recent updates (CatV2TON, FLUX.1-Fill-dev LoRA).
HuggingFace Space available.
The author also maintains an "Awesome-Try-On-Models" repository.

Licensing & Compatibility

Licensed under Creative Commons BY-NC-SA 4.0.
Non-commercial use only. Derivative works must be shared under the same license.

Limitations & Caveats

The project is primarily tested on Linux; Windows users may encounter issues (see issue #8). The authors note that the Gradio app is not a stable release.

Health Check

Last Commit: 2 months ago
Responsiveness: Inactive
Pull Requests (30d): 0
Issues (30d): 1

Star History

24 stars in the last 30 days
