MVEdit by Lakonik

PyTorch code for multi-view diffusion-based 3D generation research

Created 1 year ago
336 stars

Top 81.8% on SourcePulse

Project Summary

This repository provides the official PyTorch implementation for "3D-Adapter: Geometry-Consistent Multi-View Diffusion for High-Quality 3D Generation" and "Generic 3D Diffusion Adapter Using Controlled Multi-View Editing." It enables high-quality 3D asset generation and editing through controlled multi-view diffusion, targeting researchers and developers in computer vision and graphics.

How It Works

The project uses multi-view diffusion models to achieve geometry-consistent 3D generation. It acts as an adapter that plugs into existing diffusion pipelines and guides generation across multiple views, keeping the views spatially coherent and the resulting 3D assets high in fidelity. The optimization-based variant of the adapter relies on off-the-shelf models and requires no further training.

Quick Start & Requirements

  • Installation: Clone the repository and install dependencies via pip install -r requirements.txt. A conda environment with Python 3.10, PyTorch 2.1.2, and CUDA 12.1 is recommended. FFmpeg and x264 are optional for video export.
  • Prerequisites: Linux (Ubuntu 20+), CUDA Toolkit 11.8+, and PyTorch 2.1+. FFmpeg and x264 are needed only for video export. Windows is supported, though packages such as tiny-cuda-nn may need adjustments.
  • Inference: Run python app.py --unload-models to start the Gradio Web UI. A GPU with at least 24GB VRAM is required.
  • Resources: Initial model downloads can be extensive.
  • Links: Project page, Demo, Paper.
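
The requirements listed above can be checked before launching the Web UI. The following is a minimal sketch, not part of the repository: the function names (parse_version, meets_minimum, check_environment) are hypothetical, and treating Python 3.10 as a minimum rather than a recommendation is an assumption. It uses only the standard library plus PyTorch when it is installed.

```python
import importlib.util
import re
import shutil
import sys

# Thresholds taken from the notes above; treating them as hard minimums is an assumption.
MIN_PYTHON = (3, 10)
MIN_TORCH = (2, 1)


def parse_version(text):
    """Return the leading numeric part of a version string as a tuple of ints."""
    match = re.match(r"\d+(?:\.\d+)*", text)
    if match is None:
        return ()
    return tuple(int(part) for part in match.group(0).split("."))


def meets_minimum(version, minimum):
    """Check whether a version string is at least the given minimum tuple."""
    parsed = parse_version(version)
    minimum = tuple(minimum)
    width = max(len(parsed), len(minimum))
    # Pad with zeros so (2, 1) and (2, 1, 0) compare as equal.
    pad = lambda t: t + (0,) * (width - len(t))
    return pad(parsed) >= pad(minimum)


def check_environment():
    """Collect a small report on Python, optional video tools, PyTorch and the GPU."""
    report = {"python": sys.version_info[:2] >= MIN_PYTHON}
    # FFmpeg and x264 are optional; they only matter for video export.
    for tool in ("ffmpeg", "x264"):
        report[tool] = shutil.which(tool) is not None
    if importlib.util.find_spec("torch") is None:
        report["torch"] = False
        return report
    import torch  # imported lazily so the script also runs without PyTorch

    report["torch"] = meets_minimum(torch.__version__, MIN_TORCH)
    report["cuda"] = torch.cuda.is_available()
    if report["cuda"]:
        total_bytes = torch.cuda.get_device_properties(0).total_memory
        # Reported as a number: nominal 24GB cards often show slightly under 24 GiB.
        report["vram_gib"] = round(total_bytes / 2**30, 1)
    return report


if __name__ == "__main__":
    for name, value in check_environment().items():
        print(f"{name}: {value}")
```

The VRAM figure is reported rather than compared against 24, because the usable memory a driver exposes is typically a little below the nominal card size.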

Highlighted Details

  • Implements geometry-consistent multi-view diffusion for high-quality 3D generation.
  • Offers a Gradio Web UI for accessible inference and API access.
  • Built upon numerous foundational libraries including SSDNeRF, Stable-DreamFusion, and Gaussian Splatting.
  • Integrates with Zero123++, IP-Adapter, TRACER, LoFTR, and Omnidata for enhanced capabilities.
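
Because the Web UI is built with Gradio, its API can in principle be inspected from Python with the gradio_client package. The sketch below is an illustration under assumptions: the helper names are hypothetical, the address assumes Gradio's default local host and port (127.0.0.1:7860), and no endpoint names are given because the summary does not list them. It asks the running server which endpoints it exposes instead of guessing them.

```python
def local_ui_url(host="127.0.0.1", port=7860):
    """Build the URL of a locally running Gradio app (Gradio's default port is assumed)."""
    return f"http://{host}:{port}/"


def list_api_endpoints(url):
    """Ask a running Gradio server to describe its API endpoints as a dict.

    Requires the gradio_client package and a server already started,
    for example with `python app.py --unload-models`.
    """
    from gradio_client import Client  # lazy import: only needed when actually connecting

    client = Client(url)
    # view_api describes the available endpoints and their parameters.
    return client.view_api(print_info=False, return_format="dict")


# Example usage (needs the Web UI to be running locally):
# endpoints = list_api_endpoints(local_ui_url())
# print(endpoints)
```

Listing the endpoints from the live server is a useful cross-check when the written API documentation and the actual parameter types or defaults disagree.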

Maintenance & Community

The project is associated with Stanford University, Apparate Labs, and UCSD. GRM-based 3D-Adapter models are pending release alongside GRM.

Licensing & Compatibility

The repository does not explicitly state a license in the provided README. Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

GRM-based 3D-Adapters are not yet released. Certain packages may require specific configuration for Windows installation. API documentation may contain inaccuracies in data types and default values.

Health Check

  • Last Commit: 8 months ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 1
  • Star History: 2 stars in the last 30 days
