Text-to-3D model using NeRF and diffusion
This repository provides a PyTorch implementation of DreamFusion, a text-to-3D model leveraging Stable Diffusion. It enables generating 3D models from text prompts and images, with mesh export capabilities. The project targets researchers and developers in 3D content generation and AI art.
How It Works
The core approach replaces Imagen with Stable Diffusion. Because Stable Diffusion operates in latent space, the NeRF render must first be encoded into latents, which requires backpropagating through the VAE encoder. A multi-resolution grid encoder (torch-ngp) accelerates NeRF rendering (around 10 FPS at 800x800). Recent updates add support for Perp-Neg to mitigate the multi-head (Janus) problem in text-to-3D generation.
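To make the latent-space detail concrete, here is a minimal sketch of one score distillation sampling (SDS) step, assuming diffusers-style VAE, U-Net, and scheduler interfaces; the renderer output nerf_rgb and the omitted timestep weighting are illustrative stand-ins, not the repository's exact code.

```python
import torch

def sds_step(vae, unet, scheduler, text_emb, nerf_rgb):
    """One SDS step. nerf_rgb: (B, 3, 512, 512) render in [0, 1], requires_grad=True."""
    # Encode the render into latent space; gradients flow back through
    # the VAE encoder into the NeRF parameters (0.18215 is SD's latent scale).
    latents = vae.encode(nerf_rgb * 2 - 1).latent_dist.sample() * 0.18215

    # Forward diffusion: pick a random timestep and noise the latents.
    t = torch.randint(20, 980, (1,), device=latents.device)
    noise = torch.randn_like(latents)
    noisy = scheduler.add_noise(latents, noise, t)

    # Predict the noise with the frozen U-Net (no grad through the U-Net).
    with torch.no_grad():
        noise_pred = unet(noisy, t, encoder_hidden_states=text_emb).sample

    # SDS skips the U-Net Jacobian: the residual is injected directly as
    # the latents' gradient, then flows back through the VAE encoder.
    grad = noise_pred - noise  # timestep weighting w(t) omitted for brevity
    latents.backward(gradient=grad)
```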
Quick Start & Requirements
Install dependencies with pip install -r requirements.txt. Image-conditioned generation additionally requires downloading the Zero-1-to-3 checkpoint (zero123-xl.ckpt) and the Omnidata checkpoints (omnidata_dpt_depth_v2.ckpt, omnidata_dpt_normal_v2.ckpt).
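A sketch of a typical setup-and-run sequence follows; the main.py flags reflect the project's documented usage, but treat the prompt and workspace name as illustrative:

```sh
pip install -r requirements.txt

# text-to-3D with Stable Diffusion guidance (-O enables the recommended
# options, e.g. fp16 and the CUDA ray-marching backend)
python main.py --text "a hamburger" --workspace trial -O

# export the trained NeRF as a textured mesh
python main.py --workspace trial -O --test --save_mesh
```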
Highlighted Details
Maintenance & Community
The project is a work in progress under active development, as indicated by its recent updates. The primary contributor is Jiaxiang Tang.
Licensing & Compatibility
The repository itself is not explicitly licensed in the README. However, it depends on models and libraries with their own licenses (e.g., Stable Diffusion, Zero-1-to-3, diffusers). Users must adhere to the terms of these underlying components, particularly for commercial use.
Limitations & Caveats
The project is explicitly described as a work in progress: generation quality does not match the original paper, and many prompts fail. The main divergence from the original DreamFusion is the use of Stable Diffusion in place of Imagen.