Fantasia3D by Gorilla-Lab-SCUT

Official code for the ICCV 2023 paper on text-to-3D content creation with disentangled geometry and appearance modeling

created 2 years ago
772 stars

Top 46.1% on sourcepulse

Project Summary

Fantasia3D is an ICCV 2023 project that enables high-quality text-to-3D content creation by disentangling geometry and appearance modeling. It targets researchers and practitioners in 3D generative AI, offering a flexible framework for generating detailed and realistic 3D assets from text prompts.

How It Works

Fantasia3D treats geometry and appearance generation as separate problems. Disentangling them allows explicit surface modeling, similar to methods like VolSDF, and makes it possible to integrate BRDF material representations for photorealistic rendering. Geometry generation feeds normal and mask images to Stable Diffusion as input and uses data augmentation to improve alignment with the text description. Appearance modeling offers three strategies that mitigate over-saturation and over-smoothing, with the aim of improving realism and detail.
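To make the phrase "BRDF material representation" concrete, the following is a minimal sketch of a generic Cook-Torrance BRDF with a GGX distribution, evaluated for a single light direction. It is not taken from the Fantasia3D code base: the function name `shade`, its parameters, and the specific geometry and Fresnel approximations are assumptions chosen for illustration only.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def normalize(v):
    length = math.sqrt(dot(v, v))
    return tuple(x / length for x in v)

def shade(albedo, roughness, metallic, normal, view, light):
    """Illustrative Cook-Torrance BRDF (GGX) times the cosine term.

    Hypothetical helper, not the repository's renderer. Returns RGB
    radiance reflected toward `view` for unit radiance arriving from `light`.
    """
    n, v, l = normalize(normal), normalize(view), normalize(light)
    n_l, n_v = dot(n, l), dot(n, v)
    if n_l <= 0.0 or n_v <= 0.0:
        # Light or viewer is below the surface: nothing is reflected.
        return (0.0, 0.0, 0.0)
    h = normalize(tuple(a + b for a, b in zip(v, l)))  # half vector
    n_h = max(dot(n, h), 0.0)
    v_h = max(dot(v, h), 0.0)

    alpha = max(roughness, 0.04) ** 2
    # GGX normal distribution term.
    d = alpha ** 2 / (math.pi * (n_h ** 2 * (alpha ** 2 - 1.0) + 1.0) ** 2)
    # Smith-Schlick geometry term (one common approximation).
    k = alpha / 2.0
    g = (n_l / (n_l * (1.0 - k) + k)) * (n_v / (n_v * (1.0 - k) + k))
    # Schlick Fresnel; base reflectance blends 0.04 and albedo by metalness.
    f0 = tuple(0.04 * (1.0 - metallic) + c * metallic for c in albedo)
    fresnel = tuple(f + (1.0 - f) * (1.0 - v_h) ** 5 for f in f0)

    specular = tuple(d * g * f / (4.0 * n_l * n_v) for f in fresnel)
    diffuse = tuple((1.0 - metallic) * c / math.pi for c in albedo)
    return tuple((df + sp) * n_l for df, sp in zip(diffuse, specular))

# Example: a rough, non-metallic grey surface lit and viewed head-on.
color = shade((0.5, 0.5, 0.5), 1.0, 0.0, (0, 0, 1), (0, 0, 1), (0, 0, 1))
```

Because the material is described by a few physically meaningful parameters (albedo, roughness, metalness), an asset produced this way can in principle be relit under new lighting, which is the practical benefit of separating appearance from geometry.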

Quick Start & Requirements

  • Installation:
    • Option 1: pip install -r requirements.txt (may cause package conflicts).
    • Option 2: docker pull registry.cn-guangzhou.aliyuncs.com/baopin/fantasia3d:1.0
  • Prerequisites: Ubuntu 20.04. Tested GPUs: RTX 3090, RTX 4090, A100, V100. xformers can optionally be installed for acceleration.
  • Setup: The Docker image is recommended for quick deployment.
  • Resources: Official results were generated on 8x RTX 3090 GPUs, and multi-GPU training is recommended for best results. Single-GPU training has been tested on the pineapple examples.
  • Links: Paper, Project Page, Video

Highlighted Details

  • Disentangled geometry and appearance modeling for improved control and realism.
  • Supports zero-shot and user-guided mesh generation.
  • Offers three strategies for appearance modeling to address common generation artifacts.
  • Provides extensive tips for parameter tuning and achieving better results.

Maintenance & Community

The project is associated with ICCV 2023. The README includes contribution guidelines and a naming convention for shared configurations. No specific community channels (Discord/Slack) or active development roadmap are mentioned.

Licensing & Compatibility

The repository does not explicitly state a license. The code is presented as the official repository for an academic paper, implying research-focused usage. Commercial use compatibility is not specified.

Limitations & Caveats

Achieving results comparable to the paper's official configurations may require 8 GPUs; performance on fewer GPUs is not guaranteed. The README notes that gradient accumulation for single GPU training is a planned feature but not yet implemented. Some generation strategies may require careful parameter tuning to avoid artifacts like over-saturation or strange colors.
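Gradient accumulation, which the README lists as planned, is a general technique for emulating a large batch on limited hardware: gradients from several small micro-batches are summed before a single parameter update. The sketch below illustrates the idea on a toy one-parameter least-squares model in plain Python. The function names and data are hypothetical and unrelated to the Fantasia3D training code.

```python
def grad_mse(w, batch):
    """Gradient of the mean squared error of y = w * x over a batch."""
    return sum(2.0 * (w * x - y) * x for x, y in batch) / len(batch)

def accumulated_grad(w, batch, micro_batch_size):
    """Accumulate micro-batch gradients into the full-batch gradient."""
    total = 0.0
    for start in range(0, len(batch), micro_batch_size):
        micro = batch[start:start + micro_batch_size]
        # Weight each micro-batch by its share of the samples so the
        # running sum equals the mean gradient over the whole batch.
        total += grad_mse(w, micro) * len(micro) / len(batch)
    return total

# Toy data (made up for illustration): roughly y = 2x.
data = [(1.0, 2.0), (2.0, 4.1), (3.0, 5.9), (4.0, 8.2),
        (5.0, 9.8), (6.0, 12.1), (7.0, 14.0), (8.0, 15.9)]

full = grad_mse(0.5, data)                                 # one large batch
small = accumulated_grad(0.5, data, micro_batch_size=2)    # four micro-batches
```

Up to floating-point rounding, `full` and `small` agree, which is why accumulation can stand in for a larger batch when memory or GPU count is the constraint. Whether it would reproduce the paper's 8-GPU results for this project is not something the source states.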

Health Check

  • Last commit: 1 year ago
  • Responsiveness: 1 day
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 5 stars in the last 90 days
