Kandinsky-3 by ai-forever

Text-to-image diffusion model for multifunctional generative tasks

created 1 year ago
380 stars

Top 76.1% on sourcepulse

View on GitHub
Project Summary

Kandinsky 3.1 is a text-to-image diffusion model designed for high-quality, realistic image generation with enhanced features. It targets researchers and power users seeking advanced control and efficiency in AI-driven visual content creation, offering improvements over its predecessor, Kandinsky 3.0.

How It Works

Kandinsky 3.1 builds upon a latent diffusion architecture, incorporating a Flan-UL2 text encoder and a large U-Net. A key innovation is Kandinsky Flash, a distilled model using Adversarial Diffusion Distillation on latents for significantly faster inference (4 steps) without quality degradation. It also features prompt beautification via an LLM (Intel's neural-chat-7b-v3-1) and integrates IP-Adapter and ControlNet for image-conditional generation.
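
As a concrete illustration of the text-to-image path, the released Kandinsky 3 weights can also be driven through Hugging Face diffusers. This is a minimal sketch under the assumption that the kandinsky-community/kandinsky-3 checkpoint and diffusers' AutoPipelineForText2Image API are used; neither is prescribed by this repository, which provides its own example notebooks.

```python
import torch
from diffusers import AutoPipelineForText2Image

# Minimal text-to-image sketch, assuming the kandinsky-community/kandinsky-3
# checkpoint on the Hugging Face Hub.
pipe = AutoPipelineForText2Image.from_pretrained(
    "kandinsky-community/kandinsky-3",
    variant="fp16",
    torch_dtype=torch.float16,
)
pipe.enable_model_cpu_offload()  # offload submodules to keep VRAM usage manageable

prompt = "A photograph of a red fox in a snowy forest, golden hour lighting"
image = pipe(prompt, num_inference_steps=25).images[0]
image.save("fox.png")
```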

Quick Start & Requirements

  • Install via pip install -r requirements.txt after creating a conda environment.
  • Requires CUDA 11.1+ and PyTorch 1.10.1 (a quick version check is sketched after this list).
  • Example usage is provided in Jupyter notebooks under ./examples.
  • Official HuggingFace repository and project page links are available.
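
After installing the pinned requirements, a quick sanity check of the environment can be run from Python. This is a minimal sketch; the exact version strings depend on the wheels you installed, and only the CUDA availability check is strictly required to run the model.

```python
import torch

# Verify the environment roughly matches the pinned requirements
# (PyTorch 1.10.1 built against CUDA 11.1).
print(torch.__version__)          # e.g. "1.10.1+cu111" per requirements.txt
print(torch.version.cuda)         # e.g. "11.1"
print(torch.cuda.is_available())  # a CUDA-capable GPU is needed for inference
```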

Highlighted Details

  • Kandinsky Flash offers 4-step inference, 3x faster than the base model.
  • Integrates prompt beautification using an LLM for improved text-to-image results.
  • Supports inpainting, image fusion, and image variations (an image-to-image sketch follows this list).
  • IP-Adapter and ControlNet enable image-conditional generation.
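
The diffusers library also ships an image-to-image pipeline for Kandinsky 3, which can approximate the image-variation workflow listed above. The sketch below is an assumption-laden illustration: the kandinsky-community/kandinsky-3 checkpoint name, the AutoPipelineForImage2Image routing, and the strength value are not taken from this repository.

```python
import torch
from diffusers import AutoPipelineForImage2Image
from diffusers.utils import load_image

# Image-to-image sketch, assuming diffusers routes the
# kandinsky-community/kandinsky-3 checkpoint to its Kandinsky 3 img2img pipeline.
pipe = AutoPipelineForImage2Image.from_pretrained(
    "kandinsky-community/kandinsky-3",
    variant="fp16",
    torch_dtype=torch.float16,
)
pipe.enable_model_cpu_offload()  # reduces peak VRAM usage

init_image = load_image("input.png")  # any local image or URL
prompt = "the same scene rendered as a watercolor painting"
image = pipe(prompt, image=init_image, strength=0.75).images[0]
image.save("variation.png")
```

Note that the repository's own IP-Adapter and ControlNet integrations are a separate path from this diffusers-based sketch.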

Maintenance & Community

The project is actively developed by a team including Vladimir Arkhipkin, Anastasia Maltseva, Andrei Filatov, and Igor Pavlov. Links to HuggingFace and a Telegram bot are provided for community engagement.

Licensing & Compatibility

The model weights are released under a permissive license, allowing for commercial use and integration into closed-source applications.

Limitations & Caveats

The initial installation instructions specify PyTorch 1.10.1+cu111, which is an older version and may require careful dependency management for compatibility with newer CUDA toolkits.

Health Check

  • Last commit: 6 months ago
  • Responsiveness: 1+ week
  • Pull Requests (30d): 0
  • Issues (30d): 0

Star History

  • 19 stars in the last 90 days

Explore Similar Projects

Starred by Patrick von Platen (Core Contributor to Hugging Face Transformers and Diffusers) and Omar Sanseviero (DevRel at Google DeepMind).

Kandinsky-2 by ai-forever
Multilingual text-to-image latent diffusion model
0.0%, 3k stars, created 2 years ago, updated 1 year ago
Starred by Patrick von Platen (Core Contributor to Hugging Face Transformers and Diffusers), Jiayi Pan (Author of SWE-Gym; AI Researcher at UC Berkeley), and 2 more.

glide-text2im by openai
Text-conditional image synthesis model from research paper
0.1%, 4k stars, created 3 years ago, updated 1 year ago
Starred by Chip Huyen (Author of AI Engineering, Designing Machine Learning Systems), Patrick von Platen (Core Contributor to Hugging Face Transformers and Diffusers), and 12 more.

stablediffusion by Stability-AI
Latent diffusion model for high-resolution image synthesis
0.1%, 41k stars, created 2 years ago, updated 1 month ago