IDE-3D by MrTornado24

3D-aware portrait synthesis for interactive disentangled editing

Created 3 years ago
485 stars

Top 63.4% on SourcePulse

Project Summary

IDE-3D addresses the trade-off between quality and editability in 3D-aware facial generation, enabling high-resolution, view-consistent, and disentangled portrait synthesis with interactive editing capabilities. It targets researchers and practitioners in computer graphics and generative AI who require fine-grained control over 3D face generation.

How It Works

The system employs a three-component architecture: a 3D-semantics-aware generative model for disentangled outputs, a hybrid GAN inversion for faithful reconstruction, and a canonical editor for semantic mask manipulation. This approach combines the strengths of low-resolution editability and high-resolution photorealism by leveraging semantic masks and a hybrid inversion technique for efficient, high-quality editing.
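
The summary does not say what makes the inversion "hybrid". One common reading, assumed here, is a fast encoder prediction followed by a short optimization that refines the latent code. The toy Python sketch below illustrates only that two-stage idea: the linear `generate`, the crude `encode`, and every constant are hypothetical stand-ins, not IDE-3D's networks or API.

```python
from typing import List

# Hypothetical stand-in "generator": a fixed linear map from a
# 2-D latent code to a 3-value "image". Not the real 3D-aware GAN.
WEIGHTS = [[1.0, 0.5], [0.0, 2.0], [1.5, -1.0]]


def generate(latent: List[float]) -> List[float]:
    """Render a toy 'image' from a latent code."""
    return [sum(w * z for w, z in zip(row, latent)) for row in WEIGHTS]


def encode(image: List[float]) -> List[float]:
    """Hypothetical stand-in encoder: a cheap, deliberately imprecise guess."""
    return [0.8 * image[0], 0.4 * image[1]]


def loss(latent: List[float], target: List[float]) -> float:
    """Squared reconstruction error between generate(latent) and target."""
    return sum((g - t) ** 2 for g, t in zip(generate(latent), target))


def refine(latent: List[float], target: List[float],
           steps: int = 200, lr: float = 0.05) -> List[float]:
    """Stage 2: plain gradient descent on the latent, starting from the encoder guess."""
    z = list(latent)
    for _ in range(steps):
        residual = [g - t for g, t in zip(generate(z), target)]
        # Gradient of the squared error with respect to each latent coordinate.
        grad = [2.0 * sum(r * row[j] for r, row in zip(residual, WEIGHTS))
                for j in range(len(z))]
        z = [zj - lr * gj for zj, gj in zip(z, grad)]
    return z


target = generate([0.7, -0.3])     # the "real portrait" to reconstruct
initial = encode(target)           # stage 1: fast but approximate
refined = refine(initial, target)  # stage 2: slower but more faithful

print("encoder-only loss:", loss(initial, target))
print("after refinement: ", loss(refined, target))
```

The encoder alone leaves a visible reconstruction error, and the refinement step reduces it. In this reading, that trade-off between speed and faithfulness is what a two-stage inversion is meant to balance.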

Quick Start & Requirements

  • Create the conda environment: conda env create -f environment.yml
  • Requires two pre-trained checkpoints: ide3d-ffhq-64-512.pkl and encoder-base-hybrid.pkl.
  • Processing the FFHQ dataset is recommended; the repository provides instructions.
  • Interactive editing requires additional dependencies: pip install -r ./Painter/requirements.txt
  • Official project page: https://mrtornado24.github.io/IDE-3D/
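
Because the tools depend on the two pre-trained checkpoints listed above, it can help to confirm they are present before launching anything. The following Python sketch does only that check. The checkpoint filenames come from the list above, but the directory name `pretrained_models` and the helper `missing_checkpoints` are assumptions made for illustration, not part of the repository.

```python
from pathlib import Path
from typing import Iterable, List

# Filenames as listed in the requirements above.
REQUIRED_CHECKPOINTS = ("ide3d-ffhq-64-512.pkl", "encoder-base-hybrid.pkl")


def missing_checkpoints(model_dir: str,
                        required: Iterable[str] = REQUIRED_CHECKPOINTS) -> List[str]:
    """Return the required checkpoint filenames that are not found in model_dir."""
    root = Path(model_dir)
    return [name for name in required if not (root / name).is_file()]


if __name__ == "__main__":
    # "pretrained_models" is a hypothetical location; use wherever you saved the files.
    missing = missing_checkpoints("pretrained_models")
    if missing:
        print("Missing checkpoints:", ", ".join(missing))
    else:
        print("All required checkpoints found.")
```

Run it from the directory that contains your checkpoint folder, and adjust the path to match your setup.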

Highlighted Details

  • Reports state-of-the-art photorealism, faithfulness, and efficiency.
  • Supports free-view face drawing, editing, and style control.
  • Enables real portrait image editing via GAN inversion and interactive tools.
  • Facilitates semantic-guided style animation and CLIP-guided domain adaptation.

Maintenance & Community

The accompanying paper was published in ACM Transactions on Graphics (SIGGRAPH Asia 2022). The code borrows from StyleGAN3, PTI, EG3D, and StyleGAN-NADA. Training scripts are noted as "will be released soon."

Licensing & Compatibility

The repository does not explicitly state a license. However, its academic publication and reliance on other projects (some with permissive licenses) suggest it is intended for research purposes. Commercial use would require careful review of any underlying component licenses.

Limitations & Caveats

Training scripts are not yet released, limiting the ability to train custom models. The project relies on pre-trained models and specific dataset formats, which may require significant effort to adapt for custom use cases.

Health Check

  • Last commit: 2 years ago
  • Responsiveness: Inactive
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 1 star in the last 30 days
