nv-tlabs/DiffusionRenderer: Video diffusion for neural inverse and forward rendering
Top 89.9% on SourcePulse
Summary
DiffusionRenderer is a framework for high-quality geometry and material estimation from videos (inverse rendering) and photorealistic synthesis from G-buffers (forward rendering). Aimed at computer vision and graphics researchers and engineers, it takes a data-driven approach, using video diffusion models to approximate light transport. This enables realistic relighting and material editing without explicit simulation, which is especially valuable when geometry is imprecise or unavailable.
How It Works
The system employs video diffusion models for both inverse rendering (scene attribute estimation) and forward rendering (image/video synthesis). It approximates light transport in a data-driven manner, generating realistic lighting effects without traditional path tracing or precise geometry. Trained on synthetic and auto-labeled real-world videos, this approach has an advantage over classic physically based rendering (PBR) in scenarios with challenging geometry.
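The inverse/forward split described above can be pictured with a toy interface. This is a conceptual sketch only: the function names, G-buffer keys, and placeholder math below are illustrative assumptions, and in the real project each stub is replaced by a learned video diffusion model.

```python
import numpy as np

def inverse_render(video: np.ndarray) -> dict:
    """Toy stand-in for inverse rendering: map an RGB video of shape
    (T, H, W, 3) to per-frame G-buffers. Placeholder estimates only."""
    t, h, w, _ = video.shape
    return {
        "albedo":    np.clip(video, 0.0, 1.0),   # placeholder: reuse input colors
        "normals":   np.zeros((t, h, w, 3)),     # placeholder: flat normals
        "roughness": np.full((t, h, w, 1), 0.5), # placeholder: uniform roughness
    }

def forward_render(gbuffers: dict, env_light: np.ndarray) -> np.ndarray:
    """Toy stand-in for forward rendering: map G-buffers plus an
    environment-light map back to an RGB video. A crude albedo-times-
    average-light product replaces the learned light transport."""
    return np.clip(gbuffers["albedo"] * env_light.mean(), 0.0, 1.0)

# Relighting = inverse pass, then forward pass under a new light.
video = np.random.rand(4, 8, 8, 3)
gbuffers = inverse_render(video)
relit = forward_render(gbuffers, env_light=np.random.rand(16, 32, 3))
```

The point of the sketch is the data flow: relighting and material editing operate on the intermediate G-buffers, so no explicit light-transport simulation is ever invoked.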
Quick Start & Requirements
Installation requires Python 3.10, PyTorch (v2.1-2.4) with CUDA 12.1, and the project dependencies (pip install -r requirements.txt). Model weights are available on Hugging Face via download scripts. Inference scripts are provided (inference_svd_rgbx.py for inverse rendering, inference_svd_xrgb.py for forward rendering), each requiring a configuration file and input data. High-end GPUs (e.g., A100 80GB, RTX 4090 24GB) are recommended, with memory-saving options available.
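A sketch of the setup steps above. Only the requirements file and the two inference script names come from the source; the environment name, the PyTorch wheel index URL, and the config paths are assumptions, so check the repository README for the exact invocations.

```shell
# Environment with the supported Python version (assumed conda workflow)
conda create -n diffusionrenderer python=3.10 -y
conda activate diffusionrenderer

# PyTorch 2.1-2.4 built against CUDA 12.1, then project dependencies
pip install torch --index-url https://download.pytorch.org/whl/cu121
pip install -r requirements.txt

# Inverse rendering (video -> G-buffers); config path is illustrative
python inference_svd_rgbx.py --config <path/to/inverse_config.yaml>

# Forward rendering (G-buffers -> video); config path is illustrative
python inference_svd_xrgb.py --config <path/to/forward_config.yaml>
```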
Maintenance & Community
Recent updates include the "Cosmos DiffusionRenderer" and announced upcoming code releases. The provided README does not mention community channels or a maintenance roadmap.
Licensing & Compatibility
The project uses the Nvidia Source Code License for its code/models, with the base model under the Stability AI Community License. Users must review both for compatibility, especially regarding commercial use or integration into closed-source applications, as restrictions may apply.
Limitations & Caveats
Inference demands substantial GPU memory (over 22 GB even with optimizations like FP16 and VAE chunking). CPU offloading is possible but not recommended due to performance impacts. The provided code is for the academic version; newer enhanced versions are in development.
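The VAE chunking mentioned above is a standard memory-saving pattern: decode latent frames in small batches so peak memory scales with the chunk rather than the whole video. The sketch below is a generic illustration with a dummy decoder, not the repository's actual implementation.

```python
import numpy as np

def decode_in_chunks(latents: np.ndarray, decode_fn, chunk_size: int = 4) -> np.ndarray:
    """Decode latent frames chunk by chunk instead of all at once.
    Peak memory is bounded by `chunk_size` frames' worth of activations."""
    outputs = []
    for start in range(0, len(latents), chunk_size):
        outputs.append(decode_fn(latents[start:start + chunk_size]))
    return np.concatenate(outputs, axis=0)

# FP16 latents halve storage; the dummy decoder upcasts and scales.
latents = np.random.rand(10, 4, 8, 8).astype(np.float16)
decoded = decode_in_chunks(latents, lambda x: x.astype(np.float32) * 2.0)
```

The trade-off is wall-clock time (more, smaller decoder calls) for a lower memory ceiling, which is why even with FP16 and chunking the footprint only drops to just above 22 GB rather than eliminating the high-end GPU requirement.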