pixel2style2pixel by eladrich

Image-to-image translation framework built on a StyleGAN encoder

created 4 years ago
3,257 stars

Top 15.2% on sourcepulse

Project Summary

The pixel2style2pixel (pSp) framework is an encoder-based approach to image-to-image translation that maps input images directly into StyleGAN's extended latent space (W+). Training requires no adversarial component, and the method supports multi-modal synthesis by design. It is aimed at researchers and practitioners who work with StyleGAN and need flexible image translation.
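
As a rough illustration of what a W+ code looks like, the sketch below computes how many style vectors a StyleGAN2-style generator consumes at a given output resolution, using the commonly cited relation 2 * log2(resolution) - 2. The function name `n_style_vectors` is hypothetical and chosen for this example; it is not taken from the repository.

```python
import math


def n_style_vectors(resolution):
    """Return the number of style vectors in a W+ code.

    Assumes a StyleGAN2-style generator whose output resolution is a
    power of two, with 2 * log2(resolution) - 2 style inputs.
    """
    log2_res = math.log2(resolution)
    if resolution < 4 or not log2_res.is_integer():
        raise ValueError("resolution must be a power of two, at least 4")
    return 2 * int(log2_res) - 2


# Resolutions listed for pSp; each style vector is typically 512-dimensional.
for res in (256, 512, 1024):
    print(res, n_style_vectors(res))
```

Under this assumption, a 1024-pixel generator takes 18 style vectors, so an encoder targeting it would emit an 18 x 512 code per image.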

How It Works

pSp uses a custom encoder network to generate style vectors that are fed directly into a pre-trained StyleGAN generator. Rather than following the traditional "invert first, edit later" pipeline, it treats translation itself as an encoding problem. As a result, pSp can handle tasks without strict pixel-to-pixel correspondence between input and output, and it can produce multi-modal outputs by style-mixing encoded codes with other style codes.
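
The style-mixing idea can be sketched in a few lines. This is a minimal illustration, not the repository's implementation: it assumes a W+ code is a list of per-layer style vectors, and the helper `style_mix` and its parameters (`layers`, `alpha`) are hypothetical names. Random Gaussian vectors stand in for both the encoder output and the second style code.

```python
import random


def style_mix(codes, inject, layers, alpha=1.0):
    """Blend `inject` into `codes` on the selected style layers.

    codes, inject: lists of per-layer style vectors (same shape).
    layers: indices of the layers to modify.
    alpha: 1.0 replaces a layer outright, 0.0 leaves it unchanged.
    """
    if len(codes) != len(inject):
        raise ValueError("codes and inject must have the same number of layers")
    chosen = set(layers)
    mixed = []
    for i, (w, v) in enumerate(zip(codes, inject)):
        if i in chosen:
            # Linear interpolation between the two style vectors.
            mixed.append([alpha * b + (1.0 - alpha) * a for a, b in zip(w, v)])
        else:
            mixed.append(list(w))
    return mixed


random.seed(0)
n_styles, latent_dim = 18, 512  # assumed shape for a 1024-pixel generator
encoded = [[random.gauss(0, 1) for _ in range(latent_dim)] for _ in range(n_styles)]
sampled = [[random.gauss(0, 1) for _ in range(latent_dim)] for _ in range(n_styles)]

# Keep the coarse layers from the encoder and take the fine layers elsewhere.
mixed = style_mix(encoded, sampled, layers=range(8, n_styles), alpha=1.0)
```

Sampling several different codes for the injected layers while holding the encoded layers fixed is what yields multiple plausible outputs for one input. Which layers to swap is a design choice; the split at layer 8 here is arbitrary.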

Quick Start & Requirements

  • Installation: Clone the repository and install dependencies via environment/psp_env.yaml (Anaconda recommended).
  • Prerequisites: Linux or macOS, NVIDIA GPU with CUDA and cuDNN.
  • Inference: A Jupyter notebook (notebooks/inference_playground.ipynb) is provided for easy visualization and inference.
  • Pretrained Models: Downloadable models for StyleGAN inversion, face frontalization, sketch-to-face, segmentation-to-face, super-resolution, and toonification are available. Links to auxiliary models (StyleGAN, IR-SE50, MoCo, CurricularFace, MTCNN) are also provided.
  • Documentation: Detailed setup and usage instructions are available within the README.

Highlighted Details

  • Supports StyleGAN generators at output resolutions of 256, 512, and 1024 pixels.
  • Integrates MoCo-based similarity loss for non-facial domains.
  • Offers Weights & Biases integration for experiment tracking.
  • Enables multi-modal synthesis via style-mixing for conditional tasks and super-resolution.
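
A feature-similarity loss of the kind listed above is often written as one minus the cosine similarity between embeddings of the generated and target images. The sketch below shows that general form only. It assumes the feature vectors have already been extracted (for example by a pretrained MoCo network), and `cosine_similarity_loss` is a hypothetical name, not the repository's API.

```python
import math


def _normalize(vec, eps=1e-12):
    """Scale a vector to unit length, guarding against zero norm."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / max(norm, eps) for x in vec]


def cosine_similarity_loss(feats_out, feats_target):
    """Mean of (1 - cosine similarity) over a batch of feature pairs."""
    if not feats_out or len(feats_out) != len(feats_target):
        raise ValueError("both batches must be non-empty and the same size")
    total = 0.0
    for out, target in zip(feats_out, feats_target):
        a, b = _normalize(out), _normalize(target)
        total += 1.0 - sum(x * y for x, y in zip(a, b))
    return total / len(feats_out)


# Toy embeddings standing in for features of generated and target images.
generated = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
target = [[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
print(cosine_similarity_loss(generated, target))
```

The loss is 0 when the embeddings point in the same direction and grows as they diverge, which makes it usable in domains where a face-recognition network would not give meaningful features.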

Maintenance & Community

The project is the official implementation of a CVPR 2021 paper. Key contributors are listed in the README. Links to related projects and media mentions are provided.

Licensing & Compatibility

The core pSp code appears to be MIT licensed, consistent with its dependencies like StyleGAN2. However, the CUDA files within the StyleGAN2 ops directory are under the Nvidia Source Code License-NC, which may restrict commercial use or linking in closed-source projects.

Limitations & Caveats

CPU execution is not supported out of the box. The non-commercial license on the StyleGAN2 CUDA ops may also limit commercial applications.

Health Check

  • Last commit: 2 years ago
  • Responsiveness: 1 day
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 25 stars in the last 90 days
