stargan-v2  by clovaai

PyTorch image-to-image translation for multiple domains (CVPR 2020)

Created 5 years ago
3,600 stars

Top 13.5% on SourcePulse

Project Summary

StarGAN v2 is a PyTorch implementation of diverse image synthesis across multiple domains. It targets two limitations of earlier image-to-image translation models: limited diversity in the generated images and the need for multiple models to cover all domains. It is aimed at researchers and practitioners in computer vision and generative AI who need scalable, high-quality image translation.

How It Works

StarGAN v2 uses a single generator and a single discriminator to translate images between multiple domains. A mapping network transforms random latent codes into style codes, and a style encoder extracts style codes from reference images. The generator is conditioned on a style code, so one model can produce diverse outputs and scale to many domains within a unified framework.
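The two sources of style codes described above can be illustrated with a toy example. This is a minimal sketch in plain Python, not the repository's PyTorch code: the class `ToyStyleNetwork`, the layer sizes, and the idea of feeding the style encoder a flat feature vector instead of an image are all assumptions made for illustration. It shows one common way to serve several domains with one network, a shared trunk followed by one output head per domain, where the domain index selects the head.

```python
import random


def make_linear(n_in, n_out, rng):
    """Return a random weight matrix (list of rows) for a toy linear layer."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]


def apply_linear(weights, vec):
    """Multiply a weight matrix by a vector."""
    return [sum(w * v for w, v in zip(row, vec)) for row in weights]


class ToyStyleNetwork:
    """Shared trunk followed by one output head per domain.

    Hypothetical stand-in for both the mapping network (input: latent code)
    and the style encoder (input: features of a reference image).
    """

    def __init__(self, input_dim, hidden_dim, style_dim, num_domains, seed=0):
        rng = random.Random(seed)
        self.trunk = make_linear(input_dim, hidden_dim, rng)
        self.heads = [make_linear(hidden_dim, style_dim, rng)
                      for _ in range(num_domains)]

    def __call__(self, vec, domain):
        # Shared computation with a ReLU non-linearity.
        hidden = [max(0.0, h) for h in apply_linear(self.trunk, vec)]
        # Only the head of the requested domain produces the style code.
        return apply_linear(self.heads[domain], hidden)


# Illustrative sizes only; they are not taken from the repository.
LATENT_DIM, FEATURE_DIM, STYLE_DIM, NUM_DOMAINS = 16, 32, 8, 3

mapping_network = ToyStyleNetwork(LATENT_DIM, 24, STYLE_DIM, NUM_DOMAINS, seed=1)
style_encoder = ToyStyleNetwork(FEATURE_DIM, 24, STYLE_DIM, NUM_DOMAINS, seed=2)

rng = random.Random(42)
z = [rng.gauss(0.0, 1.0) for _ in range(LATENT_DIM)]               # random latent code
ref_features = [rng.gauss(0.0, 1.0) for _ in range(FEATURE_DIM)]   # stand-in for a reference image

style_from_latent = mapping_network(z, domain=2)                   # latent-guided style
style_from_reference = style_encoder(ref_features, domain=2)       # reference-guided style
```

Both paths yield a style code of the same size, which is what lets a single generator accept either one.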

Quick Start & Requirements

  • Install: Clone the repository and set up a Conda environment with specified PyTorch (1.4.0), torchvision (0.5.0), and CUDA (10.0) versions. Install additional dependencies via pip.
  • Prerequisites: Python 3.6.7, PyTorch 1.4.0, CUDA 10.0, Conda.
  • Data & Models: Download CelebA-HQ and AFHQ datasets and pre-trained networks using download.sh.
  • Usage: Scripts are provided for generating sample images, interpolation videos, and evaluating performance using FID and LPIPS.
  • Links: Paper: https://arxiv.org/abs/1912.01865, Video: https://youtu.be/0EVh5Ki4dIY
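As a rough illustration of how an interpolation video can be built, the sketch below blends two style codes linearly and returns one style code per frame. It assumes that each frame is rendered from a blended style code; the function names `lerp_style` and `interpolation_path` are hypothetical, and the repository's own scripts may use a different schedule or blending rule.

```python
def lerp_style(style_a, style_b, alpha):
    """Blend two style codes: alpha=0 gives style_a, alpha=1 gives style_b."""
    return [(1.0 - alpha) * a + alpha * b for a, b in zip(style_a, style_b)]


def interpolation_path(style_a, style_b, num_frames):
    """Return num_frames style codes evenly spaced between two endpoints."""
    if num_frames < 2:
        raise ValueError("num_frames must be at least 2")
    return [lerp_style(style_a, style_b, i / (num_frames - 1))
            for i in range(num_frames)]


# Two made-up 2-dimensional style codes, for illustration only.
path = interpolation_path([0.0, 2.0], [1.0, 0.0], num_frames=5)
```

Feeding each entry of `path` to the generator together with the same source image would give a sequence of frames that moves smoothly from the first style to the second.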

Highlighted Details

  • Reported state-of-the-art image-to-image translation results on the CelebA-HQ and AFHQ datasets at the time of publication (CVPR 2020).
  • Introduces the AFHQ dataset for evaluating animal face image translation.
  • Supports both latent code-guided and reference image-guided style control.
  • A separate TensorFlow implementation, written by a team member, is also available.

Maintenance & Community

The project is the official implementation of the CVPR 2020 paper from Clova AI (NAVER AI Lab). The repository does not mention ongoing maintenance or community channels such as Discord or Slack.

Licensing & Compatibility

The source code, pre-trained models, and dataset are released under the Creative Commons BY-NC 4.0 license. This license permits non-commercial use, modification, and distribution, provided appropriate credit is given and changes are indicated. Commercial use requires contacting clova-jobs@navercorp.com.

Limitations & Caveats

The project requires specific older versions of PyTorch (1.4.0) and CUDA (10.0), which may pose compatibility challenges with newer hardware and software stacks. The non-commercial license restricts its use in commercial products.
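Because the pinned versions are the main compatibility risk, it can help to compare an environment against them before running anything. The following is a minimal sketch using only the version numbers listed above; the helper names `parse_version` and `find_mismatches` and the example `installed` dictionary are hypothetical and not part of the repository.

```python
# Versions taken from the prerequisites listed for this project.
REQUIRED = {
    "python": "3.6.7",
    "torch": "1.4.0",
    "torchvision": "0.5.0",
    "cuda": "10.0",
}


def parse_version(text):
    """Turn '1.4.0+cu100' into (1, 4, 0), ignoring any local build suffix."""
    core = text.split("+")[0]
    return tuple(int(part) for part in core.split(".") if part.isdigit())


def find_mismatches(installed, required=REQUIRED):
    """Return {name: (found, wanted)} for every missing or different version."""
    mismatches = {}
    for name, wanted in required.items():
        found = installed.get(name)
        if found is None or parse_version(found) != parse_version(wanted):
            mismatches[name] = (found, wanted)
    return mismatches


# Example environment (made up): everything matches except PyTorch.
installed = {
    "python": "3.6.7",
    "torch": "2.1.0+cu121",
    "torchvision": "0.5.0",
    "cuda": "10.0",
}
mismatches = find_mismatches(installed)
```

In a real environment, the `installed` dictionary could be filled from `platform.python_version()`, `torch.__version__`, `torchvision.__version__`, and `torch.version.cuda`.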

Health Check

  • Last commit: 2 years ago
  • Responsiveness: 1 week
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 9 stars in the last 30 days
