IDE-3D  by MrTornado24

3D-aware portrait synthesis for interactive disentangled editing

created 2 years ago
484 stars

Top 64.3% on sourcepulse

Project Summary

IDE-3D addresses the trade-off between quality and editability in 3D-aware facial generation, enabling high-resolution, view-consistent, and disentangled portrait synthesis with interactive editing capabilities. It targets researchers and practitioners in computer graphics and generative AI who require fine-grained control over 3D face generation.

How It Works

The system employs a three-component architecture: a 3D-semantics-aware generative model for disentangled outputs, a hybrid GAN inversion for faithful reconstruction, and a canonical editor for semantic mask manipulation. This approach combines the strengths of low-resolution editability and high-resolution photorealism by leveraging semantic masks and a hybrid inversion technique for efficient, high-quality editing.

Quick Start & Requirements

  • Create the conda environment with: conda env create -f environment.yml
  • Download the pre-trained checkpoints (ide3d-ffhq-64-512.pkl, encoder-base-hybrid.pkl).
  • Preprocessing the FFHQ dataset is recommended; instructions are provided in the repository.
  • For interactive editing, install the extra dependencies with: pip install -r ./Painter/requirements.txt
  • Official project page: https://mrtornado24.github.io/IDE-3D/
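
Since the requirements above name two pre-trained checkpoint files, a quick pre-flight check can catch a missing download before any script is run. The following is a minimal sketch, not part of the IDE-3D repository: the function name missing_checkpoints and the folder name pretrained_models are hypothetical, and only the two file names come from the list above.

```python
from pathlib import Path

# File names come from the requirements list above; everything else in this
# sketch (function name, folder name) is an assumption for illustration.
REQUIRED_CHECKPOINTS = ("ide3d-ffhq-64-512.pkl", "encoder-base-hybrid.pkl")


def missing_checkpoints(checkpoint_dir, required=REQUIRED_CHECKPOINTS):
    """Return the required checkpoint file names not found in checkpoint_dir."""
    root = Path(checkpoint_dir)
    return [name for name in required if not (root / name).is_file()]


if __name__ == "__main__":
    # "pretrained_models" is a hypothetical location; use your download folder.
    missing = missing_checkpoints("pretrained_models")
    if missing:
        print("Missing checkpoints:", ", ".join(missing))
    else:
        print("All required checkpoints found.")
```

The check only confirms that the files exist; it does not validate their contents or version.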

Highlighted Details

  • Reported state-of-the-art photorealism, reconstruction faithfulness, and editing efficiency.
  • Supports free-view face drawing, editing, and style control.
  • Enables real portrait image editing via GAN inversion and interactive tools.
  • Facilitates semantic-guided style animation and CLIP-guided domain adaptation.

Maintenance & Community

The accompanying paper was published in ACM Transactions on Graphics (SIGGRAPH Asia 2022). The code builds on StyleGAN3, PTI, EG3D, and StyleGAN-NADA. Training scripts are noted as "will be released soon."

Licensing & Compatibility

The repository does not explicitly state a license. However, its academic publication and reliance on other projects (some with permissive licenses) suggest it is intended for research purposes. Commercial use would require careful review of any underlying component licenses.

Limitations & Caveats

Training scripts are not yet released, limiting the ability to train custom models. The project relies on pre-trained models and specific dataset formats, which may require significant effort to adapt for custom use cases.

Health Check

  • Last commit: 2 years ago
  • Responsiveness: Inactive
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 1 star in the last 90 days
