Text-to-image model based on unCLIP architecture
Karlo is a text-conditional image generation model that produces high-quality images from text prompts, recovering fine detail in fewer denoising steps. It is based on OpenAI's unCLIP architecture and is aimed at researchers and developers working with advanced diffusion models.
How It Works
Karlo uses an unCLIP architecture made up of prior, decoder, and super-resolution (SR) modules. Its enhanced SR module upscales images from 64px to 256px in just 7 reverse steps. It does this in two stages: an SR module trained with the DDPM objective performs the initial upscaling, and a second SR module fine-tuned with a VQ-GAN-style loss then recovers high-frequency details.
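To make the module order concrete, here is a minimal sketch of the three-stage flow as plain data. It is an illustration, not Karlo's code: the `Stage` class and helper names are hypothetical, the step counts for the prior and decoder are assumed placeholder values, and only the 7 super-resolution steps and the 64px to 256px sizes come from the description above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    """One module of an unCLIP-style pipeline (hypothetical helper type)."""
    name: str
    reverse_steps: int  # denoising steps run at inference time
    output: str         # what the stage hands to the next one


# Prior and decoder step counts are illustrative assumptions;
# only the 7 super-resolution steps are stated in the text.
STAGES = (
    Stage("prior", 25, "CLIP image embedding"),
    Stage("decoder", 25, "64x64 image"),
    Stage("super_resolution", 7, "256x256 image"),
)


def total_reverse_steps(stages=STAGES) -> int:
    """Sum the denoising steps across all modules."""
    return sum(stage.reverse_steps for stage in stages)


def upscale_factor(low_res: int = 64, high_res: int = 256) -> int:
    """Per-side scale factor applied by the super-resolution module."""
    return high_res // low_res


for stage in STAGES:
    print(f"{stage.name:>16}: {stage.reverse_steps:>2} steps -> {stage.output}")
print("total reverse steps:", total_reverse_steps())
print("SR upscale factor:", upscale_factor())
```

Under these assumed counts the super-resolution stage accounts for a small share of the total sampling budget, which is the efficiency point the description makes.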
Quick Start & Requirements
Install dependencies: pip install diffusers transformers accelerate safetensors
Download the required files with the wget commands or setup.sh.
Run the demo: python demo/product_demo.py
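After installing the packages, generation through diffusers might look like the sketch below. It assumes the `UnCLIPPipeline` class from diffusers and the `kakaobrain/karlo-v1-alpha` model id on the Hugging Face Hub, neither of which is named on this page. The helper names `karlo_step_config` and `generate` are hypothetical, and the prior and decoder step counts are assumed values.

```python
def karlo_step_config(sr_steps: int = 7) -> dict:
    """Keyword arguments that set how many reverse steps each module runs."""
    return {
        "prior_num_inference_steps": 25,    # assumed value
        "decoder_num_inference_steps": 25,  # assumed value
        "super_res_num_inference_steps": sr_steps,  # 7 per the description
    }


def generate(prompt: str,
             model_id: str = "kakaobrain/karlo-v1-alpha",  # assumed model id
             device: str = "cuda"):
    """Load the pipeline and return one generated PIL image."""
    # Imported lazily so the module can be loaded without the heavy dependencies.
    import torch
    from diffusers import UnCLIPPipeline

    # Half precision only makes sense on a GPU; fall back to float32 elsewhere.
    dtype = torch.float16 if device == "cuda" else torch.float32
    pipe = UnCLIPPipeline.from_pretrained(model_id, torch_dtype=dtype).to(device)
    return pipe(prompt, **karlo_step_config()).images[0]
```

Calling `generate("a photo of a red frog on a green leaf").save("frog.png")` would download the weights on first use and write the result to disk; a CUDA-capable GPU is assumed unless `device="cpu"` is passed.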
Highlighted Details
Available in the diffusers library.
Maintenance & Community
Available through diffusers and Huggingface Spaces.
Contact contact@kakaobrain.com for collaboration or feedback.
Licensing & Compatibility
Limitations & Caveats