ICT: High-Fidelity Pluralistic Image Completion with Transformers
This repository provides the official PyTorch implementation of the Image Completion Transformer (ICT) for high-fidelity pluralistic image completion. It is designed for researchers and practitioners in computer vision and generative modeling who need to fill missing regions in images with realistic and diverse content. The ICT leverages transformers for their superior ability to understand shape and geometry compared to traditional CNN-based methods.
How It Works
The ICT employs a two-stage approach. First, a Transformer model generates a coarse, semantically plausible completion in a latent space. This is followed by a guided upsampling network that refines the completion to high resolution, ensuring fidelity and detail. The transformer's attention mechanism is key to capturing long-range dependencies and contextual information, crucial for accurate image completion.
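The two-stage pipeline described above can be sketched schematically. Everything in this snippet (function names, the mean-fill prior standing in for the transformer, the copy-through upsampling step) is an illustrative assumption, not the repository's actual API:

```python
import numpy as np

def coarse_completion(image, mask):
    """Stage 1 stand-in: fill masked pixels with a coarse guess.

    The real model runs a transformer over a downsampled token grid to
    produce a semantically plausible prior; here we simply average the
    visible pixels as a placeholder.
    """
    visible = image[mask == 1]                      # (N, 3) known pixels
    fill = visible.mean(axis=0) if visible.size else np.zeros(image.shape[-1])
    coarse = image.copy()
    coarse[mask == 0] = fill                        # fill the hole
    return coarse

def guided_upsample(coarse, image, mask):
    """Stage 2 stand-in: keep known pixels, use the coarse guess elsewhere."""
    return np.where(mask[..., None] == 1, image, coarse)

# Toy 8x8 RGB image with a masked 4x4 hole (mask: 1 = known, 0 = missing).
img = np.random.rand(8, 8, 3)
mask = np.ones((8, 8), dtype=np.uint8)
mask[2:6, 2:6] = 0
result = guided_upsample(coarse_completion(img, mask), img, mask)
```

The key property preserved by the second stage is fidelity to the known region: pixels outside the mask pass through unchanged, while the hole is filled from the coarse prior.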
Quick Start & Requirements
Install dependencies:

```shell
pip install -r requirements.txt
```

Input images and masks must be supplied in `.png` format, with masks binarized.
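A mask can be binarized before use along these lines; the helper name and the 128 threshold below are illustrative assumptions, not part of the repository:

```python
import numpy as np

def binarize_mask(mask, threshold=128):
    """Map a grayscale mask (0-255) to strictly binary {0, 255} values,
    as a binarized .png mask requires. threshold is an assumed cutoff."""
    mask = np.asarray(mask, dtype=np.uint8)
    return np.where(mask >= threshold, 255, 0).astype(np.uint8)

# Example: an anti-aliased grayscale mask collapses to two values.
gray = np.array([[0, 64, 127], [128, 200, 255]], dtype=np.uint8)
binary = binarize_mask(gray)
```

The resulting array can then be saved as a `.png` with any image library before being passed to the model.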
Maintenance & Community
The repository is maintained by Ziyu Wan (@Raywzy). Contact is available via email: raywzy@gmail.com.
Licensing & Compatibility
The repository is for academic research use only. No specific license is mentioned, implying potential restrictions on commercial use.
Limitations & Caveats
The model is trained exclusively at 256x256 resolution. Masks must be binarized and supplied in `.png` format. Pre-trained models are large and require manual download. The "academic research use only" clause may restrict commercial applications.