threestudio by threestudio-project

Framework for 3D content generation from text/images using 2D diffusion

Created 2 years ago

6,981 stars

Top 7.3% on SourcePulse

8 Experts Love This Project

Chip Huyen

Author of "AI Engineering", "Designing Machine Learning Systems"

Luis Capelo

Cofounder of Lightning AI

Robin Rombach

Cofounder of Black Forest Labs

Christian Laforte

Distinguished Engineer at NVIDIA; Former CTO at Stability AI

and 4 more!

Project Summary

threestudio is a unified framework for generating 3D content from text or images, leveraging 2D text-to-image diffusion models. It aims to provide a flexible and extensible platform for researchers and developers to experiment with and implement various 3D generation techniques.

How It Works

The framework employs score distillation sampling (SDS) and related techniques to optimize 3D representations (like NeRFs or SDFs) based on guidance from pre-trained 2D diffusion models. It supports multiple 3D representations and guidance methods, allowing for diverse generation approaches. The modular design enables easy integration of new methods and customization of the generation pipeline.
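The SDS update described above can be illustrated with a toy, dependency-free sketch. It is not threestudio's implementation: the "renderer" is assumed to be the identity (the parameters are the image), the "diffusion model" is a hypothetical stand-in that predicts noise exactly for a data distribution concentrated on a single target point, and the noise schedule and weighting w(t) are arbitrary choices made for illustration. All function names are made up for this example.

```python
import math
import random


def sds_gradient(theta, target, t, rng):
    """One-sample SDS gradient for a toy identity renderer (image x == theta)."""
    alpha = math.cos(0.5 * math.pi * t)  # signal scale at noise level t (assumed schedule)
    sigma = math.sin(0.5 * math.pi * t)  # noise scale at noise level t (assumed schedule)

    # Forward diffusion: add Gaussian noise to the rendered image.
    eps = [rng.gauss(0.0, 1.0) for _ in theta]
    x_t = [alpha * x + sigma * e for x, e in zip(theta, eps)]

    # Stand-in for a pre-trained denoiser: the exact noise prediction when
    # the data distribution is a single point `target` (an assumption).
    eps_hat = [(xt - alpha * y) / sigma for xt, y in zip(x_t, target)]

    # SDS gradient: w(t) * (eps_hat - eps) * dx/dtheta, with dx/dtheta = identity.
    w = sigma ** 2  # assumed weighting
    return [w * (eh - e) for eh, e in zip(eps_hat, eps)]


def optimize(target, steps=500, lr=0.1, seed=0):
    """Gradient descent on theta using noisy SDS gradients at random noise levels."""
    rng = random.Random(seed)
    theta = [0.0 for _ in target]
    for _ in range(steps):
        t = rng.uniform(0.02, 0.98)  # avoid the endpoints where sigma or alpha vanish
        grad = sds_gradient(theta, target, t, rng)
        theta = [p - lr * g for p, g in zip(theta, grad)]
    return theta


if __name__ == "__main__":
    print(optimize([1.0, -2.0, 0.5]))
```

In this toy setting the parameters drift toward the target because the stand-in denoiser always points there. In a real pipeline the renderer would be a differentiable NeRF or SDF renderer and the noise prediction would come from a text-conditioned diffusion model.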

Quick Start & Requirements

Installation: pip install -r requirements.txt (Python >= 3.8, PyTorch >= 1.12). Docker installation is also supported.
Hardware: NVIDIA GPU with at least 6GB VRAM, CUDA installed.
Dependencies: PyTorch, Ninja (recommended for CUDA extensions). DeepFloyd IF requires a Hugging Face login and license agreement.
Documentation: installation.md, DOCUMENTATION.md
Demo: HuggingFace Spaces, self-hosted service
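The requirements above (Python >= 3.8, PyTorch >= 1.12, an NVIDIA GPU with at least 6GB VRAM) could be checked before installing with a small preflight script. This is a hedged sketch, not part of threestudio: the function names are hypothetical, and the thresholds are simply copied from the list above.

```python
import re
import sys

MIN_PYTHON = (3, 8)   # from the requirements listed above
MIN_TORCH = (1, 12)   # from the requirements listed above
MIN_VRAM_GB = 6       # from the requirements listed above


def parse_version(text):
    """Extract (major, minor) from a version string such as '2.1.0+cu118'."""
    match = re.match(r"(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    return int(match.group(1)), int(match.group(2))


def check_environment(python_version, torch_version, vram_gb):
    """Return a list of human-readable problems; an empty list means all checks pass."""
    problems = []
    if tuple(python_version[:2]) < MIN_PYTHON:
        problems.append("Python 3.8 or newer is required")
    if torch_version is None:
        problems.append("PyTorch is not installed")
    elif parse_version(torch_version) < MIN_TORCH:
        problems.append("PyTorch 1.12 or newer is required")
    if vram_gb is None:
        problems.append("no CUDA-capable GPU detected")
    elif vram_gb < MIN_VRAM_GB:
        problems.append("at least 6 GB of GPU memory is required")
    return problems


def detect():
    """Best-effort detection of the PyTorch version and VRAM of GPU 0."""
    try:
        import torch
    except ImportError:
        return None, None
    vram_gb = None
    if torch.cuda.is_available():
        vram_gb = torch.cuda.get_device_properties(0).total_memory / 1024 ** 3
    return str(torch.__version__), vram_gb

# Example usage:
#   torch_version, vram_gb = detect()
#   print(check_environment(sys.version_info, torch_version, vram_gb))
```

Keeping the comparison logic in check_environment separate from detect makes it possible to test the thresholds on a machine without a GPU.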

Highlighted Details

Supports numerous state-of-the-art 3D generation methods including DreamFusion, Magic3D, ProlificDreamer, Zero-1-to-3, and Gaussian Splatting.
Offers a custom extension system for adding new methods.
Includes features like prompt debiasing, Perp-Neg, and VRAM optimization techniques.
Provides a Gradio web interface for easier experimentation.

Maintenance & Community

Developed with contributions from various researchers; note that the Health Check below reports the last commit as 1 year ago and responsiveness as inactive.
Discord server for community discussion.
GitHub Issues for bug reports and feature requests.

Licensing & Compatibility

The project itself appears to be permissively licensed, but the models it relies on (e.g., Stable Diffusion, DeepFloyd IF) carry their own licenses, which may restrict commercial use or require accepting specific agreements.

Limitations & Caveats

Achieving high-quality results often requires significant VRAM and computational resources.
Some implementations are experimental or unofficial re-implementations of original papers.
Results can be sensitive to hyperparameters and may require extensive tuning.

Health Check

Last Commit

1 year ago

Responsiveness

Inactive

Star History

19 stars in the last 30 days