pixel2style2pixel by eladrich

Image-to-image translation framework using StyleGAN encoder

Created 5 years ago

3,268 stars

Top 14.6% on SourcePulse

Project Summary

The pixel2style2pixel (pSp) framework offers an encoder-based approach to image-to-image translation that maps input images directly into StyleGAN's extended latent space (W+). Training is simpler because no adversarial components are needed, and multi-modal synthesis is supported inherently. This makes pSp suitable for researchers and practitioners who work with StyleGAN and need flexible image translation.

How It Works

pSp uses a custom encoder network to generate style vectors that are fed directly into a pre-trained StyleGAN generator. Rather than following the traditional "invert first, edit later" pipeline, it treats translation itself as an encoding problem. As a result, pSp can handle tasks without strict pixel-to-pixel correspondence, and it draws on StyleGAN's generative capabilities to produce multi-modal outputs through style-mixing.
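The style-mixing step can be illustrated with a minimal sketch that treats a W+ code as a plain list of style vectors, one per generator layer. It is not pSp's actual API: the names num_style_vectors, style_mix and random_code are hypothetical, the vector width of 512 and the split at layer 8 are assumptions, and random numbers stand in for both the encoder output and the sampled code. The layer-count formula follows the usual StyleGAN2 convention of two style inputs per resolution level.

```python
import math
import random

STYLE_DIM = 512  # assumed width of each style vector


def num_style_vectors(resolution):
    """Style inputs of a StyleGAN2-style generator at a given output resolution."""
    return 2 * int(math.log2(resolution)) - 2


def style_mix(encoded, sampled, start_layer):
    """Keep layers [0, start_layer) from `encoded` and take the rest from `sampled`."""
    if len(encoded) != len(sampled):
        raise ValueError("both W+ codes must have the same number of layers")
    if not 0 <= start_layer <= len(encoded):
        raise ValueError("start_layer is out of range")
    coarse = [list(v) for v in encoded[:start_layer]]
    fine = [list(v) for v in sampled[start_layer:]]
    return coarse + fine


def random_code(n_styles, rng):
    """Stand-in for a W+ code; a real one would come from the encoder or mapping network."""
    return [[rng.gauss(0.0, 1.0) for _ in range(STYLE_DIM)] for _ in range(n_styles)]


rng = random.Random(0)
n_styles = num_style_vectors(1024)    # 18 layers for a 1024x1024 generator
encoded = random_code(n_styles, rng)  # placeholder for the encoder's output
sampled = random_code(n_styles, rng)  # placeholder for a randomly sampled code
mixed = style_mix(encoded, sampled, start_layer=8)
print(len(mixed), len(mixed[0]))
```

Keeping the early layers from the encoded input preserves its coarse structure, while drawing a different sampled code for the later layers would yield a different plausible output for the same input, which is the sense in which the synthesis is multi-modal.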

Quick Start & Requirements

Installation: Clone the repository and install dependencies via environment/psp_env.yaml (Anaconda recommended).
Prerequisites: Linux/macOS, NVIDIA GPU with CUDA and CuDNN.
Inference: A Jupyter notebook (notebooks/inference_playground.ipynb) is provided for easy visualization and inference.
Pretrained Models: Downloadable models for StyleGAN inversion, face frontalization, sketch-to-face, segmentation-to-face, super-resolution, and toonification are available. Links to auxiliary models (StyleGAN, IR-SE50, MoCo, CurricularFace, MTCNN) are also provided.
Documentation: Detailed setup and usage instructions are available within the README.

Highlighted Details

Supports StyleGAN generators at output resolutions of 256, 512, and 1024.
Integrates MoCo-based similarity loss for non-facial domains.
Offers Weights & Biases integration for experiment tracking.
Enables multi-modal synthesis via style-mixing for conditional tasks and super-resolution.
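A feature-similarity loss of the kind mentioned above can be sketched as one minus the cosine similarity between feature vectors of the generated image and the target image. This is only an illustration under stated assumptions: in practice the features would come from a frozen MoCo-pretrained network, whereas here they are plain lists of floats, and the function names cosine_similarity and similarity_loss are hypothetical rather than taken from the repository.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("feature vectors must be non-zero")
    return dot / (norm_a * norm_b)


def similarity_loss(output_feats, target_feats):
    """Mean of (1 - cosine similarity) over a batch of feature-vector pairs."""
    if not output_feats or len(output_feats) != len(target_feats):
        raise ValueError("batches must be non-empty and of equal size")
    losses = [1.0 - cosine_similarity(o, t)
              for o, t in zip(output_feats, target_feats)]
    return sum(losses) / len(losses)


# Hypothetical features: the first pair is identical, the second is orthogonal.
output_feats = [[1.0, 0.0, 0.0], [0.0, 2.0, 0.0]]
target_feats = [[1.0, 0.0, 0.0], [0.0, 0.0, 3.0]]
print(similarity_loss(output_feats, target_feats))
```

The loss is zero when the features of output and target point in the same direction and grows as they diverge, so it can guide training in domains where a face-recognition network would not provide a meaningful signal.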

Maintenance & Community

The project is the official implementation of a CVPR 2021 paper. Key contributors are listed in the README. Links to related projects and media mentions are provided.

Licensing & Compatibility

The core pSp code appears to be MIT licensed, as is the PyTorch StyleGAN2 implementation it builds on. However, the CUDA files within the StyleGAN2 ops directory are under the Nvidia Source Code License-NC, which limits use to non-commercial purposes and may therefore restrict commercial use or linking in closed-source projects.

Limitations & Caveats

CPU execution is not supported out of the box: the listed prerequisites assume an NVIDIA GPU with CUDA and CuDNN. The non-commercial license on the StyleGAN2 CUDA ops may also limit commercial applications.

Health Check

Last Commit: 3 years ago
Responsiveness: Inactive

Star History

2 stars in the last 30 days