SCube by nv-tlabs

Scene reconstruction research paper using voxels and splats

Created 1 year ago

517 stars

Top 60.8% on SourcePulse

Project Summary

SCube addresses large-scale 3D scene reconstruction, generating detailed and coherent scene representations instantly. It targets researchers and engineers who work with complex 3D environments. Its main benefit is efficient reconstruction and representation of very large scenes.

How It Works

SCube employs a multi-stage generative approach. A VAE first encodes scene geometry into a latent voxel representation, and a diffusion model then reconstructs detailed geometry. Finally, a Gaussian Splatting model (GSM) reconstructs appearance. This cascade handles large-scale scenes efficiently by refining the representation progressively, from coarse to fine detail.
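The coarse-to-fine cascade described above can be illustrated with a toy sketch. This is not SCube's code: every function name here (`voxelize`, `refine`, `attach_splats`, `run_pipeline`) is hypothetical, nothing is learned, and each stage is only a geometric stand-in for the corresponding model (coarse voxels for the VAE latent grid, subdivision for the diffusion stage, one placeholder Gaussian per voxel for the GSM).

```python
from typing import Dict, List, Set, Tuple

Voxel = Tuple[int, int, int]
Point = Tuple[float, float, float]


def voxelize(points: List[Point], size: float) -> Set[Voxel]:
    """Map each point to the integer index of the voxel that contains it."""
    return {tuple(int(c // size) for c in p) for p in points}


def refine(coarse: Set[Voxel], points: List[Point], coarse_size: float) -> Set[Voxel]:
    """Stand-in for the geometry stage: halve the voxel size and keep
    only occupied fine voxels whose parent is in the coarse grid."""
    fine = voxelize(points, coarse_size / 2)
    return {v for v in fine if tuple(i // 2 for i in v) in coarse}


def attach_splats(fine: Set[Voxel], fine_size: float) -> Dict[Voxel, dict]:
    """Stand-in for the appearance stage: one placeholder Gaussian per
    fine voxel, centered in the voxel (no colors or opacities are fitted)."""
    return {
        v: {"mean": tuple((i + 0.5) * fine_size for i in v), "scale": fine_size / 2}
        for v in fine
    }


def run_pipeline(points: List[Point], coarse_size: float = 1.0):
    """Run the three toy stages in order and return each intermediate result."""
    coarse = voxelize(points, coarse_size)
    fine = refine(coarse, points, coarse_size)
    splats = attach_splats(fine, coarse_size / 2)
    return coarse, fine, splats


if __name__ == "__main__":
    pts = [(0.1, 0.1, 0.1), (0.9, 0.9, 0.9), (2.3, 0.2, 0.2)]
    coarse, fine, splats = run_pipeline(pts)
    print(len(coarse), "coarse voxels ->", len(fine), "fine voxels ->", len(splats), "splats")
```

The point of the sketch is only the data flow: each stage consumes the previous stage's output, so the expensive fine-level work is restricted to regions the coarse stage already marked as occupied.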

Quick Start & Requirements

Installation: Clone the repository and create a Conda environment using environment.yml.
Prerequisites: Python 3.x, Conda with conda-libmamba-solver, mmcv>=2.0.0, mmsegmentation>=1.0.0, and Weights & Biases (WandB) for logging. The Waymo dataset (v1.4.2) is required for both training and inference.
Data Processing: Converting the Waymo TFRecords into WebDataset format takes over 1 day on 8x A100 GPUs. The conversion runs SegFormer to produce sky masks and Metric3Dv2 to produce GT depth.
Links: Project Page
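For readers unfamiliar with the WebDataset format named in the Data Processing step: a shard is an ordinary tar archive in which files that share a basename form one sample. The sketch below writes and inspects such a shard using only the Python standard library. It is not the repository's conversion script; the helper names (`write_shard`, `read_keys`) and the field names in the usage example (`image.jpg`, `skymask.png`, `depth.npy`) are assumptions chosen for illustration.

```python
import io
import tarfile
from typing import Dict, Iterable, List, Tuple


def write_shard(path: str, samples: Iterable[Tuple[str, Dict[str, bytes]]]) -> int:
    """Write (key, {extension: payload}) samples into one tar shard.

    Every field of a sample is stored as "<key>.<extension>", so all
    files of one sample share the same basename. Returns the number of
    tar members written.
    """
    count = 0
    with tarfile.open(path, "w") as tar:
        for key, fields in samples:
            for ext, payload in fields.items():
                info = tarfile.TarInfo(name=f"{key}.{ext}")
                info.size = len(payload)
                tar.addfile(info, io.BytesIO(payload))
                count += 1
    return count


def read_keys(path: str) -> List[str]:
    """Return the sorted sample keys (basenames) found in a shard."""
    with tarfile.open(path, "r") as tar:
        return sorted({m.name.split(".", 1)[0] for m in tar.getmembers()})


if __name__ == "__main__":
    # Placeholder payloads; a real pipeline would store encoded images,
    # sky masks, and depth maps here.
    samples = [
        (f"frame_{i:06d}", {"image.jpg": b"...", "skymask.png": b"...", "depth.npy": b"..."})
        for i in range(2)
    ]
    print(write_shard("example-000000.tar", samples), "members written")
    print(read_keys("example-000000.tar"))
```

Grouping fields by basename is what lets a loader stream the archive sequentially and still reassemble each frame's image, mask, and depth together.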

Highlighted Details

Leverages a cascaded VAE-Diffusion-GSM pipeline for scene reconstruction.
Utilizes VoxSplats for efficient and high-fidelity scene representation.
Supports large-scale scene reconstruction, demonstrated with the Waymo dataset.
Offers inference for individual components (VAE, Diffusion, GSM) and a full pipeline.

Maintenance & Community

The project comes from NVIDIA's Toronto AI Lab (nv-tlabs); related works include InfiniCube and XCube. The README provides no community links such as Discord or Slack.

Licensing & Compatibility

Licensed under the NVIDIA Source Code License, which may restrict commercial use and distribution.

Limitations & Caveats

The data processing pipeline is computationally intensive: it takes over 1 day on 8x A100 GPUs. The project relies heavily on Weights & Biases for experiment tracking, and MMCV versions that do not match the mmcv>=2.0.0 requirement may cause compatibility issues. Review the license terms carefully before any commercial use.
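Because version mismatches are called out as a risk, a quick pre-flight check of the installed packages can save a failed run. The sketch below is an assumption-laden helper, not part of the repository: `MINIMUMS`, `parse`, and `check` are hypothetical names, the minimum versions are the ones listed under Prerequisites, and the simple parser ignores pre-release suffixes such as `rc1`.

```python
import re
from importlib import metadata
from typing import Callable, Dict, Tuple

# Minimum versions taken from the Prerequisites line of this summary.
MINIMUMS = {"mmcv": "2.0.0", "mmsegmentation": "1.0.0"}


def parse(version: str) -> Tuple[int, int, int]:
    """Extract (major, minor, patch) from a version string.

    Anything after the numeric part (e.g. "rc1") is ignored, and a
    missing patch number is treated as 0.
    """
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognized version: {version!r}")
    return tuple(int(g) if g is not None else 0 for g in match.groups())


def check(
    minimums: Dict[str, str],
    get_version: Callable[[str], str] = metadata.version,
) -> Dict[str, str]:
    """Report "ok", "missing", or "too old (<installed>)" per package."""
    report = {}
    for name, minimum in minimums.items():
        try:
            installed = get_version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        if parse(installed) >= parse(minimum):
            report[name] = "ok"
        else:
            report[name] = f"too old ({installed})"
    return report


if __name__ == "__main__":
    for package, status in check(MINIMUMS).items():
        print(f"{package}: {status}")
```

The `get_version` parameter exists so the check can be exercised without the packages installed; by default it queries the active environment through `importlib.metadata.version`.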
