pixie by vlongle

3D physics prediction from pixels

Created 6 months ago

258 stars

Top 98.1% on SourcePulse

Project Summary

Current 3D reconstruction methods such as NeRF and Gaussian Splatting produce static scenes. Pixie adds physics prediction: it trains a neural network to map pretrained visual features (CLIP) to dense fields of physical material properties in a single forward pass. This enables fast, generalizable physics inference and simulation for researchers and engineers who want to add dynamic behavior to 3D models.

How It Works

Pixie uses a feed-forward neural network to translate pretrained visual features, such as CLIP embeddings, into dense material fields that represent physical properties. Because inference is a single forward pass, it avoids slow, scene-specific test-time optimization, which makes physics prediction fast and generalizable. The predicted fields integrate with 3D representations such as NeRF and 3D Gaussian Splatting (3DGS), so physical attributes predicted directly from visual input can drive dynamic simulations.
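The idea of mapping per-point visual features to a material field in one forward pass can be illustrated with a toy sketch. This is not Pixie's code or architecture: the material classes, the stiffness output, the layer sizes, and the random weights and features are all assumptions chosen only to show the shape of the computation.

```python
import math
import random

# Hypothetical material classes; the source does not list Pixie's actual outputs.
MATERIALS = ["rigid", "elastic", "sand"]


def make_linear(n_in, n_out, rng):
    """Random weights standing in for trained parameters (assumption)."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]


def linear(weights, x):
    """Matrix-vector product: one output value per weight row."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_material_field(features, w_hidden, w_class, w_stiff):
    """One forward pass: each feature vector -> (material label, stiffness)."""
    field = []
    for feat in features:
        hidden = [max(0.0, h) for h in linear(w_hidden, feat)]  # ReLU layer
        probs = softmax(linear(w_class, hidden))
        material = MATERIALS[probs.index(max(probs))]
        # exp keeps the hypothetical continuous property strictly positive.
        stiffness = math.exp(linear(w_stiff, hidden)[0])
        field.append((material, stiffness))
    return field


rng = random.Random(0)
feature_dim, hidden_dim = 8, 16
w_hidden = make_linear(feature_dim, hidden_dim, rng)
w_class = make_linear(hidden_dim, len(MATERIALS), rng)
w_stiff = make_linear(hidden_dim, 1, rng)

# Random vectors standing in for CLIP features sampled at 5 points of a scene.
features = [[rng.gauss(0.0, 1.0) for _ in range(feature_dim)] for _ in range(5)]
field = predict_material_field(features, w_hidden, w_class, w_stiff)
```

The point of the sketch is that nothing is optimized per scene: once weights exist, every point's properties come from a single pass over its features.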

Quick Start & Requirements

Installation involves cloning the repo, creating a Python 3.10 conda environment, and running pip install -e . in the repository root. Key dependencies include PyTorch (built for specific CUDA versions), ninja, tiny-cuda-nn (built from source), nerfstudio, f3rm, pytorch3d, viser, tyro, vlmx, and flash-attn. Blender 4.3.2 with the BlenderNeRF and gaussian-splatting-blender-addon add-ons is required, along with associated Python packages. VLM API keys are needed for some features. Training demands significant hardware: multiple high-VRAM GPUs (e.g., 6x RTX A6000) and substantial CPU and RAM.
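Given the number of dependencies, a quick pre-flight check of the environment can save a failed build. The sketch below is not part of the repository; the import names in ASSUMED_MODULES are guesses derived from the package names above (pip names and module names can differ, and f3rm and vlmx in particular are unverified).

```python
import importlib.util
import sys

# Assumed import names for the dependencies listed above (not confirmed by the README).
ASSUMED_MODULES = [
    "torch", "tinycudann", "nerfstudio", "f3rm", "pytorch3d",
    "viser", "tyro", "vlmx", "flash_attn",
]


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def python_is_3_10(version_info=sys.version_info):
    """Check for the Python 3.10 interpreter the install instructions call for."""
    return tuple(version_info[:2]) == (3, 10)


if __name__ == "__main__":
    if not python_is_3_10():
        print("Warning: expected Python 3.10, found", sys.version.split()[0])
    for name in missing_modules(ASSUMED_MODULES):
        print("Not importable:", name)
```

Run it inside the activated conda environment; it only looks modules up and does not import them, so it will not catch binary incompatibilities in compiled extensions.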

Highlighted Details

Predicts physical properties for 3D reconstructions built with NeRF or 3D Gaussian Splatting.
Maps pretrained visual features (CLIP) to dense material fields via a single forward pass.
Enables fast, generalizable physics inference and simulation.
Integrates Vision-Language Models (VLMs) for data generation and material prediction.

Maintenance & Community

Authored by researchers from the University of Pennsylvania and MIT. The README lists no community channels (Discord, Slack) and no explicit roadmap.

Licensing & Compatibility

The repository's README does not specify a software license, creating ambiguity regarding usage rights, modification permissions, and compatibility for commercial or closed-source applications.

Limitations & Caveats

Installation is complex: several dependencies must be compiled from source (e.g., tiny-cuda-nn, PhysGaussian submodules), and specific tool versions such as Blender 4.3.2 are required. The README's "Common Issues" section indicates potential build fragility and binary incompatibility. The absence of a stated license is a critical adoption blocker.

Health Check

Last Commit

5 months ago

Responsiveness

Inactive

Star History

8 stars in the last 30 days