PyTorch implementation for text-guided image style transfer
CLIPstyler provides the official PyTorch implementation for a CVPR 2022 paper on image style transfer using a single text condition. It enables users to apply artistic styles to images based on textual descriptions, offering a novel approach to text-guided image manipulation for researchers and artists.
How It Works
The core of CLIPstyler leverages CLIP (Contrastive Language–Image Pre-training) to bridge the gap between text and image domains. It uses a style transfer network that is conditioned on a text embedding, allowing for flexible and precise style application. This approach avoids the need for paired text-image data for training specific styles, relying instead on CLIP's general understanding of visual concepts and their textual representations.
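To make the idea concrete, below is a minimal sketch of a directional CLIP-style loss in PyTorch. It is an illustration of the concept rather than the repository's exact code: the function name, the "a photo" source prompt, and the assumption that inputs are already resized and normalized to CLIP's expected format are all assumptions for this example.

import torch
import clip

device = "cuda" if torch.cuda.is_available() else "cpu"
clip_model, _ = clip.load("ViT-B/32", device=device)

def clip_direction_loss(stylized, content, style_text, source_text="a photo"):
    # Encode the target style prompt and a neutral source prompt,
    # and take their difference as a direction in CLIP's text space.
    tokens = clip.tokenize([style_text, source_text]).to(device)
    text_feat = clip_model.encode_text(tokens)
    text_dir = text_feat[0] - text_feat[1]

    # Encode the stylized output and the original content image
    # (both assumed already preprocessed to CLIP's input format),
    # and take their difference as a direction in image space.
    img_dir = clip_model.encode_image(stylized) - clip_model.encode_image(content)

    # Penalize misalignment between the image-space change and the
    # text-space change (1 - cosine similarity).
    img_dir = img_dir / img_dir.norm(dim=-1, keepdim=True)
    text_dir = text_dir / text_dir.norm()
    return (1 - img_dir @ text_dir).mean()

In practice such a loss is combined with content-preservation and regularization terms so the stylized image keeps the structure of the input while drifting toward the text-described style.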
Quick Start & Requirements
conda create -n CLIPstyler python=3.6
conda install --yes -c pytorch pytorch=1.7.1 torchvision cudatoolkit=11.0
pip install ftfy regex tqdm
conda install -c anaconda git
pip install git+https://github.com/openai/CLIP.git
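After installation, a short sanity check like the following (the model name and prompt are just examples, not taken from the repository) confirms that PyTorch and the CLIP package are working:

import torch
import clip

# Load a CLIP model and encode an example style prompt.
device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)
with torch.no_grad():
    text_features = model.encode_text(clip.tokenize(["an oil painting"]).to(device))
print(text_features.shape)  # ViT-B/32 text features are 512-dimensional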
Maintenance & Community
The project is by Gihyun Kwon and Jong Chul Ye, the paper's authors; links to the paper and citation details are provided in the README. No community channels (Discord/Slack) or roadmap are mentioned.
Licensing & Compatibility
The README does not state a license, so reuse rights default to the repository owner's copyright. The code is presented as the "Official Pytorch implementation" of the paper, and suitability for commercial use or closed-source linking is not specified.
Limitations & Caveats
The environment targets older versions of Python (3.6) and PyTorch (1.7.1), which may conflict with newer systems. The Colab demos are noted to run slowly, and the fast style transfer mode requires downloading a large training dataset and pre-trained models.