sjc by pals-ttic

Research paper for 3D generation from 2D diffusion models

Created 3 years ago

521 stars

Top 60.3% on SourcePulse

3 Experts Love This Project

jiamings

Chief Scientist at Luma AI

parasj

Cofounder of Genmo

sxyu

Research Scientist at OpenAI; Cofounder of Luma AI

Project Summary

This repository implements Score Jacobian Chaining (SJC), a method for generating 3D assets from pretrained 2D diffusion models. It targets computer vision and graphics researchers and practitioners who want to reuse 2D image priors for 3D generation.

How It Works

SJC applies the chain rule to a diffusion model's learned score function: the 2D score of a rendered image is backpropagated through the Jacobian of a differentiable renderer (here, a voxel radiance field). Summing these contributions over multiple viewpoints yields a score on the 3D parameters, so an existing 2D model can drive 3D generation. Because rendered images are not the noisy inputs the denoiser was trained on, the paper evaluates the score on noise-perturbed renderings and averages the results (Perturb-and-Average Scoring) to address this distribution mismatch.
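The chain-rule aggregation described above can be illustrated with a toy sketch. Everything in it is an assumption made for illustration, not the repository's code: the "renderer" is a fixed linear projection per view (so its Jacobian is the projection matrix itself), the 2D diffusion model is replaced by the exact score of a Gaussian centred on a target image, and the perturb-and-average step is a simplified reading of the estimation mechanism mentioned above. The names score_2d, paas_score, and score_3d are hypothetical.

```python
import numpy as np

def score_2d(x, target, sigma):
    """Toy stand-in for a diffusion model's score at noise level sigma.
    This is the exact score of N(target, sigma^2 I): (target - x) / sigma^2."""
    return (target - x) / sigma**2

def paas_score(x, target, sigma, n_samples, rng):
    """Evaluate the 2D score on noise-perturbed copies of x and average them."""
    noise = rng.standard_normal((n_samples,) + x.shape)
    return score_2d(x + sigma * noise, target, sigma).mean(axis=0)

def score_3d(theta, views, targets, sigma, n_samples, rng):
    """Chain rule: sum over views of J^T @ (2D score), with J = d(render)/d(theta)."""
    total = np.zeros_like(theta)
    for A, target in zip(views, targets):
        x = A @ theta  # linear "renderer"; its Jacobian with respect to theta is A
        total += A.T @ paas_score(x, target, sigma, n_samples, rng)
    return total

# Three axis-aligned orthographic "cameras", each observing two of three coordinates.
views = [
    np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]),
    np.array([[0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]),
    np.array([[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]),
]
theta_true = np.array([1.0, -2.0, 0.5])
targets = [A @ theta_true for A in views]  # what each view "should" look like

rng = np.random.default_rng(0)
theta = np.zeros(3)
for _ in range(500):
    # Gradient ascent along the aggregated 3D score.
    theta += 0.05 * score_3d(theta, views, targets, sigma=1.0, n_samples=16, rng=rng)

print(np.round(theta, 2))
```

Following the aggregated score by gradient ascent moves theta toward parameters whose renderings are likely under the 2D model from every view; in this toy setup that is theta_true, up to sampling noise from the perturbation step.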

Quick Start & Requirements

Install: Install PyTorch for your CUDA version (e.g., pip3 install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu116), then run pip install -r requirements.txt. Install taming-transformers manually (git clone --depth 1 git@github.com:CompVis/taming-transformers.git && pip install -e taming-transformers).
Checkpoints: Download the 12GB tar file containing the required checkpoints (SD v1.5, gddpm), extract it, and update env.json to point to the extracted files.
Usage: Run experiments from a dedicated directory (e.g., mkdir exp && cd exp). A sample generation command is python /path/to/sjc/run_sjc.py --sd.prompt "A zoomed out high quality photo of Temple of Heaven" --n_steps 10000 --lr 0.05 --sd.scale 100.0.
Resources: A 10,000-step generation run takes ~25 minutes and 10GB of GPU memory on an A5000. High-resolution visualization takes ~5 minutes and 11GB on the same GPU.
Docs: Usage examples and reproduction scripts are provided in the README.
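For scripted runs, the sample command above can be assembled as an argument list, which avoids shell-quoting problems with prompts that contain spaces. This is a minimal sketch: the helper names build_sjc_command and estimate_minutes are hypothetical, /path/to/sjc/run_sjc.py is the same placeholder used above, and the runtime estimate assumes that time scales linearly with step count from the quoted A5000 figure, which the source does not state.

```python
import shlex

def build_sjc_command(script_path, prompt, n_steps=10000, lr=0.05, scale=100.0):
    """Build the argv list for run_sjc.py using only the flags quoted above."""
    return [
        "python", script_path,
        "--sd.prompt", prompt,
        "--n_steps", str(n_steps),
        "--lr", str(lr),
        "--sd.scale", str(scale),
    ]

def estimate_minutes(n_steps, ref_minutes=25.0, ref_steps=10000):
    """Rough runtime guess; linear scaling in n_steps is an assumption."""
    return n_steps * ref_minutes / ref_steps

cmd = build_sjc_command(
    "/path/to/sjc/run_sjc.py",
    "A zoomed out high quality photo of Temple of Heaven",
)
print(shlex.join(cmd))         # shell-safe rendering of the command
print(estimate_minutes(4000))  # estimated minutes for a shorter run
```

The list can be passed to subprocess.run(cmd, cwd="exp", check=True) to launch the run from a dedicated experiment directory.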

Highlighted Details

Integrates with threestudio.
Includes implementations of Karras sampler and a voxel NeRF.
Offers a subpixel rendering script for higher quality visualizations.
Provides detailed example commands for generating various 3D assets (e.g., Trump, Temple of Heaven, School Bus).

Maintenance & Community

The project is associated with CVPR 2023.
The README notes that SJC has been integrated into threestudio.
No specific community links (Discord/Slack) or active maintenance signals are provided in the README.

Licensing & Compatibility

Released under Stable Diffusion's OpenRAIL license due to its use of SD.
No other restrictive licensing components are identified.

Limitations & Caveats

Seeds are currently hardcoded to 0.
Scripts to reproduce 2D experiments (Fig 4) are pending.
Main paper figures are not yet consistent with appendix figures (which used subpixel rendering).
DreamBooth integration is noted as not ready, with potential issues like multi-face generation and guidance scale tuning.

Health Check

Last Commit: 1 year ago
Responsiveness: Inactive
Pull Requests (30d): 0
Issues (30d): 0

Star History

1 star in the last 30 days

Explore Similar Projects

richdreamer by modelscope

Text-to-3D model for generating detailed 3D assets

Created 2 years ago

Updated 1 year ago

Starred by Omar Sanseviero (DevRel at Google DeepMind).

GaussianCube by GaussianCube

Research paper for 3D generative modeling using Gaussian splatting

Created 1 year ago

Updated 1 year ago

SCube by nv-tlabs

Scene reconstruction research paper using voxels and splats

Created 1 year ago

Updated 2 months ago

Paint3D by OpenTexture

Research paper for 3D mesh texturing via diffusion

Created 2 years ago

Updated 1 year ago

GaussianDreamer by hustvl

Framework for fast text-to-3D Gaussian generation

Created 2 years ago

Updated 1 year ago

LION by nv-tlabs

Research paper for latent point diffusion models for 3D shape generation

Created 3 years ago

Updated 1 year ago

Starred by Amit Jain (Cofounder of Luma AI), Chuan Li (Chief Scientific Officer at Lambda), and 2 more.

NSVF by facebookresearch

Research paper implementation for neural sparse voxel fields (NSVF)

Created 6 years ago

Updated 2 years ago

Make-It-3D by junshutang

3D creation from a single image using diffusion prior (ICCV 2023)

Created 2 years ago

Updated 1 year ago

Starred by Alberto Taiuti (Cofounder of Luma AI) and Saining Xie (Professor at NYU).

zero123 by cvlab-columbia

Research paper for zero-shot one image to 3D object generation

Created 2 years ago

Updated 2 years ago

Starred by Victor Taelin (Author of Bend, Kind, HVM).

DreamCraft3D by deepseek-ai

3D generator for high-fidelity object creation from a 2D reference image

Created 2 years ago

Updated 8 months ago

Starred by Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), Luis Capelo (Cofounder of Lightning AI), and 6 more.

threestudio by threestudio-project

Framework for 3D content generation from text/images using 2D diffusion

Created 2 years ago

Updated 1 year ago

Starred by Yaowei Zheng (Author of LLaMA-Factory), Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), and 13 more.

stable-dreamfusion by ashawkey

Text-to-3D model using NeRF and diffusion

Created 3 years ago

Updated 2 years ago
