stargan-v2 by clovaai

PyTorch image-to-image translation for multiple domains (CVPR 2020)

Created 6 years ago

3,609 stars

Top 13.3% on SourcePulse

2 Experts Love This Project

Cristóbal Valenzuela

Cofounder of Runway

Robin Rombach

Cofounder of Black Forest Labs

Project Summary

StarGAN v2 is a PyTorch implementation of diverse image synthesis across multiple domains. It targets two limitations of earlier image-to-image translation models: limited diversity in the generated images and poor scalability as the number of domains grows. It is aimed at computer vision and generative AI researchers and practitioners who need scalable, high-quality image translation.

How It Works

StarGAN v2 uses a single generator and a single discriminator to translate images between multiple domains. A mapping network transforms random latent codes into style codes, and a style encoder extracts style codes from reference images. The generator takes a source image together with a style code, so one framework can produce diverse outputs and scale to many domains.
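The data flow described above can be sketched in a few lines of PyTorch. This is a toy illustration, not the repository's code: the class names, layer sizes, and the scale-and-shift style injection in `ToyGenerator` are assumptions chosen for brevity, and the discriminator is omitted.

```python
import torch
import torch.nn as nn


class MappingNetwork(nn.Module):
    """Turns a random latent code z into a style code for domain y."""

    def __init__(self, latent_dim=16, style_dim=64, num_domains=2):
        super().__init__()
        self.shared = nn.Sequential(nn.Linear(latent_dim, 128), nn.ReLU())
        # One output head per domain; the head matching y is selected.
        self.heads = nn.ModuleList(
            [nn.Linear(128, style_dim) for _ in range(num_domains)])

    def forward(self, z, y):
        h = self.shared(z)
        out = torch.stack([head(h) for head in self.heads], dim=1)  # (B, D, S)
        return out[torch.arange(y.size(0)), y]                      # (B, S)


class StyleEncoder(nn.Module):
    """Extracts a style code from a reference image of domain y."""

    def __init__(self, style_dim=64, num_domains=2):
        super().__init__()
        self.shared = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten())
        self.heads = nn.ModuleList(
            [nn.Linear(16, style_dim) for _ in range(num_domains)])

    def forward(self, x, y):
        h = self.shared(x)
        out = torch.stack([head(h) for head in self.heads], dim=1)
        return out[torch.arange(y.size(0)), y]


class ToyGenerator(nn.Module):
    """Translates a source image, modulated by a style code (assumed design)."""

    def __init__(self, style_dim=64, channels=16):
        super().__init__()
        self.encode = nn.Conv2d(3, channels, 3, padding=1)
        self.norm = nn.InstanceNorm2d(channels, affine=False)
        self.to_scale_shift = nn.Linear(style_dim, channels * 2)
        self.decode = nn.Conv2d(channels, 3, 3, padding=1)

    def forward(self, x, s):
        h = self.norm(self.encode(x))
        gamma, beta = self.to_scale_shift(s).chunk(2, dim=1)
        # The style code rescales and shifts the normalized features.
        h = (1 + gamma[:, :, None, None]) * h + beta[:, :, None, None]
        return torch.tanh(self.decode(torch.relu(h)))


source = torch.randn(4, 3, 32, 32)      # images to translate
reference = torch.randn(4, 3, 32, 32)   # images supplying the style
target_domain = torch.tensor([0, 1, 1, 0])
z = torch.randn(4, 16)

mapping, encoder, generator = MappingNetwork(), StyleEncoder(), ToyGenerator()
s_latent = mapping(z, target_domain)          # latent-guided style
s_ref = encoder(reference, target_domain)     # reference-guided style
out_latent = generator(source, s_latent)
out_ref = generator(source, s_ref)
```

Both paths end in a style code of the same shape, which is why the same generator can serve either kind of guidance.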

Quick Start & Requirements

Install: Clone the repository and set up a Conda environment with specified PyTorch (1.4.0), torchvision (0.5.0), and CUDA (10.0) versions. Install additional dependencies via pip.
Prerequisites: Python 3.6.7, PyTorch 1.4.0, CUDA 10.0, Conda.
Data & Models: Download CelebA-HQ and AFHQ datasets and pre-trained networks using download.sh.
Usage: Scripts are provided for generating sample images, interpolation videos, and evaluating performance using FID and LPIPS.
Links: Paper: https://arxiv.org/abs/1912.01865, Video: https://youtu.be/0EVh5Ki4dIY
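The interpolation videos mentioned above move gradually from one style to another. A minimal sketch of the idea follows, assuming style codes are plain vectors and the blend is linear; `lerp_styles` is a hypothetical helper, not a function from the repository.

```python
def lerp_styles(s_start, s_end, num_frames):
    """Return num_frames style vectors blending linearly from s_start to s_end."""
    if num_frames < 2:
        raise ValueError("num_frames must be at least 2")
    if len(s_start) != len(s_end):
        raise ValueError("style vectors must have the same length")
    frames = []
    for i in range(num_frames):
        t = i / (num_frames - 1)  # 0.0 at the first frame, 1.0 at the last
        frames.append([(1 - t) * a + t * b for a, b in zip(s_start, s_end)])
    return frames


frames = lerp_styles([0.0, 2.0], [1.0, 4.0], num_frames=3)
print(frames)  # [[0.0, 2.0], [0.5, 3.0], [1.0, 4.0]]
```

Feeding each intermediate vector to the generator with a fixed source image would yield one video frame per vector.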

Highlighted Details

Reports state-of-the-art image-to-image translation results on the CelebA-HQ and AFHQ datasets at the time of publication (CVPR 2020).
Introduces the AFHQ dataset for evaluating animal face image translation.
Supports both latent code-guided and reference image-guided style control.
A TensorFlow implementation by a team member is also available.

Maintenance & Community

The project is the official implementation of a CVPR 2020 paper from Clova AI (NAVER AI Lab). The repository does not mention ongoing maintenance or community channels such as Discord or Slack.

Licensing & Compatibility

The source code, pre-trained models, and dataset are released under the Creative Commons BY-NC 4.0 license. This license permits non-commercial use, modification, and distribution, provided appropriate credit is given and changes are indicated. Commercial use requires contacting clova-jobs@navercorp.com.

Limitations & Caveats

The project requires specific older versions of PyTorch (1.4.0) and CUDA (10.0), which may pose compatibility challenges with newer hardware and software stacks. The non-commercial license restricts its use in commercial products.
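Because the version pins are strict, it can help to check an environment before running anything. The sketch below is illustrative and not part of the repository: `PINNED`, `parse_version`, and `find_mismatches` are hypothetical names, and the version numbers are the ones listed under Quick Start.

```python
import importlib
import platform
import re

# Pins copied from the Quick Start section.
PINNED = {"python": "3.6.7", "torch": "1.4.0", "torchvision": "0.5.0"}


def parse_version(text):
    """Turn a string such as '1.4.0+cu100' into (1, 4, 0)."""
    core = text.split("+")[0]  # drop a local build suffix if present
    return tuple(int(n) for n in re.findall(r"\d+", core)[:3])


def find_mismatches(installed, pinned=PINNED):
    """List human-readable differences between installed and pinned versions."""
    problems = []
    for name, wanted in pinned.items():
        found = installed.get(name)
        if found is None:
            problems.append(f"{name}: not installed (expected {wanted})")
        elif parse_version(found) != parse_version(wanted):
            problems.append(f"{name}: found {found}, expected {wanted}")
    return problems


if __name__ == "__main__":
    installed = {"python": platform.python_version()}
    for package in ("torch", "torchvision"):
        try:
            installed[package] = importlib.import_module(package).__version__
        except ImportError:
            pass  # reported as "not installed" below
    for line in find_mismatches(installed) or ["environment matches the pins"]:
        print(line)
```

The comparison is on exact versions, so a newer PyTorch is reported as a mismatch; whether the code still runs on it would need to be tested separately.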

Health Check

Last Commit: 2 years ago
Responsiveness: 1 week

Star History

9 stars in the last 30 days