miccunifi
Virtual try-on research paper using latent diffusion and textual inversion
LaDI-VTON addresses the challenge of virtual try-on by leveraging latent diffusion models enhanced with textual inversion. It targets researchers and developers in e-commerce and metaverse applications, offering a novel approach to generate realistic images of models wearing specified garments.
How It Works
The core innovation is a latent diffusion model augmented with a custom autoencoder module featuring learnable skip connections. This design aims to preserve the wearer's characteristics during generation. To accurately represent garment textures and details, a textual inversion component maps garment features to CLIP token embeddings, creating pseudo-word tokens that condition the diffusion process.
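The conditioning idea described above can be illustrated with a minimal, dependency-free sketch. It is not the project's implementation: the dimensions, the placeholder symbol, the random linear adapter, and every function name below are hypothetical stand-ins chosen for illustration. The sketch only shows the data flow, in which a garment feature vector is mapped to a few pseudo-word embeddings that fill placeholder slots in an otherwise ordinary prompt embedding sequence.

```python
import random

EMBED_DIM = 8          # hypothetical; real CLIP token embeddings are much wider
NUM_PSEUDO_TOKENS = 4  # hypothetical number of pseudo-word tokens per garment
FEATURE_DIM = 16       # hypothetical size of the garment feature vector
PLACEHOLDER = "$"      # hypothetical marker for a pseudo-word slot in the prompt


def make_linear(in_dim, out_dim, rng):
    """Random (out_dim x in_dim) weights standing in for a trained adapter."""
    return [[rng.uniform(-1, 1) for _ in range(in_dim)] for _ in range(out_dim)]


def apply_linear(weights, vec):
    """Plain matrix-vector product."""
    return [sum(w * x for w, x in zip(row, vec)) for row in weights]


def garment_to_pseudo_tokens(garment_features, weights):
    """Map garment features to NUM_PSEUDO_TOKENS embeddings of size EMBED_DIM."""
    flat = apply_linear(weights, garment_features)
    return [flat[i * EMBED_DIM:(i + 1) * EMBED_DIM] for i in range(NUM_PSEUDO_TOKENS)]


def build_conditioning(prompt_tokens, vocab_embeddings, pseudo_tokens):
    """Look up ordinary tokens; fill each placeholder with the next pseudo-word."""
    pseudo_iter = iter(pseudo_tokens)
    sequence = []
    for tok in prompt_tokens:
        if tok == PLACEHOLDER:
            sequence.append(next(pseudo_iter))
        else:
            sequence.append(vocab_embeddings[tok])
    return sequence


rng = random.Random(0)
words = ["a", "photo", "of", "model", "wearing"]
vocab = {w: [rng.uniform(-1, 1) for _ in range(EMBED_DIM)] for w in words}
garment_features = [rng.uniform(-1, 1) for _ in range(FEATURE_DIM)]
adapter = make_linear(FEATURE_DIM, EMBED_DIM * NUM_PSEUDO_TOKENS, rng)

pseudo = garment_to_pseudo_tokens(garment_features, adapter)
prompt = ["a", "photo", "of", "a", "model", "wearing"] + [PLACEHOLDER] * NUM_PSEUDO_TOKENS
conditioning = build_conditioning(prompt, vocab, pseudo)
print(len(conditioning), len(conditioning[0]))  # 10 8
```

In an actual pipeline the resulting sequence would pass through a text encoder before conditioning the diffusion model; that step is omitted here.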
Quick Start & Requirements
Environment: environment.yml.
Dependencies include xformers and wandb.
Inference: python src/inference.py with specified dataset and root paths.
Highlighted Details
Maintenance & Community
The project is associated with ACM Multimedia 2023 and lists several academic contributors. Training code was released in September 2023.
Licensing & Compatibility
Limitations & Caveats
The non-commercial license restricts use in commercial products. Training involves multiple stages and requires significant dataset preparation and computational resources.