LION by nv-tlabs

Code for the research paper on latent point diffusion models for 3D shape generation

created 2 years ago
806 stars

Top 44.7% on sourcepulse

Project Summary

LION addresses the generation of 3D shapes using latent point diffusion models, targeting researchers and practitioners in computer graphics and AI. It offers a novel approach to 3D shape synthesis by leveraging diffusion models in a latent space, enabling high-quality and diverse point cloud generation.

How It Works

LION employs a two-stage process: first, a Variational Autoencoder (VAE) learns a compressed latent representation of 3D point clouds. Second, a diffusion model is trained in this latent space to generate new latent codes, which are then decoded by the VAE to produce 3D point clouds. This latent diffusion approach allows for efficient and high-fidelity generation compared to direct diffusion in point cloud space.
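The snippet below is a minimal, self-contained PyTorch sketch of that two-stage idea, not the actual LION code: a toy VAE maps point clouds to a compact latent vector, and a DDPM-style sampler denoises Gaussian noise in that latent space before decoding back to points. All module names (PointVAE, LatentDenoiser, sample_shapes), shapes, and hyperparameters are illustrative assumptions, and the training loops for both stages are omitted.

```python
# Illustrative sketch of the two-stage latent diffusion pipeline, NOT LION's
# actual architecture. Names, shapes, and the simple DDPM sampler are assumptions.
import torch
import torch.nn as nn

LATENT_DIM, NUM_POINTS, T = 128, 2048, 1000

class PointVAE(nn.Module):
    """Toy stage-1 VAE: point cloud -> latent vector -> point cloud."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(NUM_POINTS * 3, 512), nn.ReLU(),
                                     nn.Linear(512, 2 * LATENT_DIM))
        self.decoder = nn.Sequential(nn.Linear(LATENT_DIM, 512), nn.ReLU(),
                                     nn.Linear(512, NUM_POINTS * 3))

    def encode(self, x):
        # x: (batch, NUM_POINTS, 3) -> mean and log-variance of the latent
        mu, logvar = self.encoder(x.flatten(1)).chunk(2, dim=-1)
        return mu, logvar

    def decode(self, z):
        return self.decoder(z).view(-1, NUM_POINTS, 3)

class LatentDenoiser(nn.Module):
    """Toy stage-2 noise-prediction network operating on VAE latents."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(LATENT_DIM + 1, 256), nn.ReLU(),
                                 nn.Linear(256, LATENT_DIM))

    def forward(self, z_t, t):
        t_embed = t.float().view(-1, 1) / T
        return self.net(torch.cat([z_t, t_embed], dim=-1))

@torch.no_grad()
def sample_shapes(vae, denoiser, n=4):
    """Denoise Gaussian noise in latent space, then decode with the VAE."""
    betas = torch.linspace(1e-4, 0.02, T)
    alphas = 1.0 - betas
    alpha_bars = torch.cumprod(alphas, dim=0)
    z = torch.randn(n, LATENT_DIM)
    for t in reversed(range(T)):
        eps = denoiser(z, torch.full((n,), t))
        coef = betas[t] / torch.sqrt(1.0 - alpha_bars[t])
        z = (z - coef * eps) / torch.sqrt(alphas[t])
        if t > 0:
            z = z + torch.sqrt(betas[t]) * torch.randn_like(z)
    return vae.decode(z)  # (n, NUM_POINTS, 3) point clouds

points = sample_shapes(PointVAE(), LatentDenoiser())
print(points.shape)  # torch.Size([4, 2048, 3])
```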

Quick Start & Requirements

  • Install via conda: conda env create --name lion_env --file=env.yaml followed by conda activate lion_env.
  • Additional dependencies: pip install git+https://github.com/openai/CLIP.git.
  • Requires CUDA 11.6.
  • Setup involves downloading ShapeNet data and released checkpoints.
  • Demo: python demo.py (requires checkpoint download).
  • Official Docs: Not explicitly linked, but the README provides detailed setup and training instructions.

Highlighted Details

  • Latent Point Diffusion Models for 3D Shape Generation.
  • Supports text-to-shape generation via CLIP embeddings (see the sketch after this list).
  • Includes code for rendering point clouds using Mitsuba.
  • Provides scripts for training VAE, diffusion prior, and evaluation.
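As context for the CLIP-based text-to-shape highlight above, here is a hedged sketch of how a text prompt could be embedded with the openai/CLIP package listed in the requirements. Only the CLIP calls are real; the conditioning step on the diffusion prior is a hypothetical placeholder, not LION's actual API.

```python
# Compute a CLIP text embedding that a conditional diffusion prior could consume.
# The `prior.sample(...)` call below is a hypothetical interface, not LION's API.
import clip
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"
model, _ = clip.load("ViT-B/32", device=device)

tokens = clip.tokenize(["a round wooden chair"]).to(device)
with torch.no_grad():
    text_embedding = model.encode_text(tokens)            # (1, 512)
    text_embedding /= text_embedding.norm(dim=-1, keepdim=True)

# Hypothetical conditioning step (illustrative only):
# shape_latents = prior.sample(condition=text_embedding)
# point_cloud = vae.decode(shape_latents)
```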

Maintenance & Community

  • Project initiated by researchers from NVIDIA and the University of Toronto.
  • Primary contact for issues: @ZENGXH.
  • Experiment logging supported via comet-ml, wandb, and TensorBoard.

Licensing & Compatibility

  • The README does not explicitly state a license.
  • Code is provided for research purposes, implying potential restrictions on commercial use.

Limitations & Caveats

  • Released checkpoints and demo data were not yet available at the time of the README's last update.
  • Training requires significant computational resources (e.g., multiple A100 or V100 GPUs).
  • Data paths for ShapeNet and rendered images may require customization.
Health Check

  • Last commit: 10 months ago
  • Responsiveness: 1 week
  • Pull Requests (30d): 0
  • Issues (30d): 1
  • Star History: 18 stars in the last 90 days

Explore Similar Projects

Starred by Chip Huyen (Author of AI Engineering, Designing Machine Learning Systems), Patrick von Platen (Core Contributor to Hugging Face Transformers and Diffusers), and 7 more.

stable-dreamfusion by ashawkey

Text-to-3D model using NeRF and diffusion
Top 0.1% · 9k stars · created 2 years ago · updated 1 year ago