richdreamer by modelscope

Text-to-3D model for generating detailed 3D assets

created 1 year ago
465 stars

Top 66.2% on sourcepulse

Project Summary

RichDreamer addresses the challenge of generating high-detail 3D assets from text prompts by leveraging a generalizable Normal-Depth diffusion model. It targets researchers and practitioners in computer graphics and AI who need to create detailed 3D content, offering improved realism and fidelity over existing text-to-3D methods.

How It Works

The core of RichDreamer is a diffusion model conditioned on both normal and depth maps, which are generated alongside the 3D representation. Working with these maps lets the diffusion process explicitly control and enhance geometric detail and surface properties. The model generates multi-view-consistent normal and depth maps, which are then used to reconstruct a 3D mesh through either a NeRF or a DMTet representation, yielding richer detail.
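Normal and depth maps describe the same surface, which is one reason it is natural to model them together. The sketch below is only an illustration of that geometric link and is not RichDreamer's code: it estimates per-pixel normals from a depth map with finite differences, assuming an orthographic camera and unit pixel spacing. The function name normals_from_depth and the sign convention are assumptions chosen for this example; renderers differ on both.

```python
import numpy as np


def normals_from_depth(depth: np.ndarray) -> np.ndarray:
    """Estimate unit surface normals from an (H, W) depth map.

    Illustrative assumptions: orthographic camera, unit pixel spacing,
    and normals oriented so that a flat, fronto-parallel surface maps
    to (0, 0, 1).
    """
    # np.gradient returns derivatives along axis 0 (rows, y) and axis 1 (columns, x).
    dz_dy, dz_dx = np.gradient(depth.astype(np.float64))
    # For a surface z = f(x, y), an unnormalized normal is (-dz/dx, -dz/dy, 1).
    normals = np.stack([-dz_dx, -dz_dy, np.ones_like(dz_dx)], axis=-1)
    # The z component is 1, so the norm is at least 1 and the division is safe.
    normals /= np.linalg.norm(normals, axis=-1, keepdims=True)
    return normals


if __name__ == "__main__":
    # A depth ramp that increases by 0.5 per pixel along x.
    ramp = np.tile(0.5 * np.arange(5.0), (3, 1))
    print(normals_from_depth(ramp)[1, 2])
```

A depth map alone loses fine surface orientation once it is quantized or noisy, while a normal map alone does not fix absolute distance, so the two signals complement each other.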

Quick Start & Requirements

  • Install: Clone the repository, create and activate a conda environment (conda create -n rd, then conda activate rd), and install the dependencies (pip install -r requirements_3d.txt).
  • Docker: A pre-built Docker image is available (registry.cn-hangzhou.aliyuncs.com/ailab-public/aigc3d), or build one from docker/Dockerfile.
  • Prerequisites: Ubuntu 20.04; tested on RTX 4090 and A100 GPUs. Requires downloading pretrained weights for the Normal-Depth Diffusion Model and the Albedo Diffusion Model, as well as the Stable Diffusion and CLIP models.
  • Links: Project Page, Paper, YouTube, ModelScope Demo.

Highlighted Details

  • CVPR 2024 Highlight paper.
  • Supports both NeRF and DMTet representations for 3D reconstruction.
  • Offers memory-saving options for single-GPU generation (e.g., RTX 3090/4090).
  • DMTet optimization on a single GPU is more stable at higher rendering resolutions (e.g., 1024).

Maintenance & Community

The project is associated with ModelScope and Damo_XR_Lab. Links to related projects like normal-depth-diffusion and gobjaverse are provided.

Licensing & Compatibility

The repository does not explicitly state a license in the README. Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

The README mentions that optimizing high-resolution DMTet spheres directly can be challenging and may require multiple GPUs, although a single GPU optimization trick is provided. Some dependencies might require specific versions or manual setup if not using Docker.

Health Check

  • Last commit: 10 months ago
  • Responsiveness: 1 week
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 13 stars in the last 90 days
