prope by liruilong940607

3D vision transformer positional encoding

Created 4 months ago
571 stars

Top 56.5% on SourcePulse

Project Summary

This repository introduces PRoPE (Projective Positional Encoding), the method from "Cameras as Relative Positional Encoding", for encoding the 3D geometric relationships between image tokens in multi-view transformers. It addresses the problem of conditioning tokens on camera geometry in multi-view computer vision tasks and offers a simple, efficient alternative to existing approaches for applications such as novel view synthesis.

How It Works

PRoPE leverages relative projective transformations to encode camera parameters, enabling transformers to understand the 3D spatial relationships between image patches. This approach is implemented as a drop-in replacement for standard scaled dot-product attention, directly integrating camera intrinsics and extrinsics into the attention mechanism.
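To make the idea of a relative projective transformation concrete, the following is a minimal NumPy sketch, not the repository's implementation. It assumes each camera is summarized by a 4x4 matrix built from its intrinsics and its world-to-camera matrix, and that the relation between cameras i and j is P_i @ inv(P_j). The helper names (projection_matrix, relative_transform, random_rigid) and the normalized intrinsics are hypothetical and chosen for illustration only.

```python
import numpy as np

def projection_matrix(K, viewmat):
    """Lift 3x3 intrinsics to 4x4 and compose with a 4x4 world-to-camera matrix."""
    K4 = np.eye(4)
    K4[:3, :3] = K
    return K4 @ viewmat

def relative_transform(P_i, P_j):
    """Relative projective transform between two cameras: P_i @ inv(P_j)."""
    return P_i @ np.linalg.inv(P_j)

def random_rigid(rng):
    """Random rigid transform: rotation from a QR decomposition plus a translation."""
    Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
    if np.linalg.det(Q) < 0:
        Q[:, 0] *= -1  # flip one axis so Q is a proper rotation (det = +1)
    T = np.eye(4)
    T[:3, :3] = Q
    T[:3, 3] = rng.normal(size=3)
    return T

rng = np.random.default_rng(0)

# Illustrative intrinsics in normalized units (an assumption, not repository values).
K = np.array([[1.2, 0.0, 0.5],
              [0.0, 1.2, 0.5],
              [0.0, 0.0, 1.0]])

view_i, view_j = random_rigid(rng), random_rigid(rng)
P_i, P_j = projection_matrix(K, view_i), projection_matrix(K, view_j)
rel = relative_transform(P_i, P_j)

# Moving the world frame by a rigid transform G right-multiplies every
# world-to-camera matrix by G; the relative transform does not change.
G = random_rigid(rng)
rel_moved = relative_transform(projection_matrix(K, view_i @ G),
                               projection_matrix(K, view_j @ G))
frame_invariant = bool(np.allclose(rel, rel_moved))

# The relative transform can be folded into a plain dot product by
# transforming the query and key separately:
#   (P_i^T q) . (inv(P_j) k) == q^T (P_i inv(P_j)) k
q, k = rng.normal(size=4), rng.normal(size=4)
score_direct = q @ rel @ k
score_folded = (P_i.T @ q) @ (np.linalg.inv(P_j) @ k)
scores_match = bool(np.isclose(score_direct, score_folded))
```

The last identity suggests why such an encoding can act as a drop-in replacement for scaled dot-product attention: queries and keys are transformed per camera before an ordinary attention call, so the pairwise relative transforms never need to be built explicitly. How the repository applies this to full feature vectors is not covered by this sketch.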

Quick Start & Requirements

  • Install: Standalone, single-file implementations are available for JAX and PyTorch (prope/jax.py, prope/torch.py).
  • Prerequisites: PyTorch or JAX, viewmats (world-to-camera matrices), Ks (camera intrinsic matrices), image dimensions, and patch size.
  • Usage: Replace torch.nn.functional.scaled_dot_product_attention with prope_dot_product_attention, passing camera parameters and image/patch dimensions.
  • Demo: PyTorch example provided in the README.
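The inputs listed above can be prepared as in the sketch below, which uses NumPy only and deliberately stops short of calling prope_dot_product_attention, because the exact signature should be taken from prope/torch.py or the README. The array shapes, the patch_grid helper, the one-token-per-patch layout, and the divisibility check are assumptions made for illustration; the actual function may expect an extra batch dimension or different argument names.

```python
import numpy as np

def patch_grid(image_width, image_height, patch_size):
    """Number of patches along x and y (hypothetical helper)."""
    if image_width % patch_size or image_height % patch_size:
        # Assumption: images are split into whole, non-overlapping patches.
        raise ValueError("image dimensions must be divisible by patch_size")
    return image_width // patch_size, image_height // patch_size

num_cameras = 2
image_width, image_height, patch_size = 256, 192, 16

# viewmats: one 4x4 world-to-camera matrix per camera.
viewmats = np.tile(np.eye(4), (num_cameras, 1, 1))
viewmats[1, 0, 3] = -0.5  # second camera offset along x (illustrative value)

# Ks: one 3x3 pinhole intrinsic matrix per camera (illustrative focal length).
fx = fy = 200.0
K = np.array([[fx, 0.0, image_width / 2],
              [0.0, fy, image_height / 2],
              [0.0, 0.0, 1.0]])
Ks = np.tile(K, (num_cameras, 1, 1))

patches_x, patches_y = patch_grid(image_width, image_height, patch_size)
tokens_per_image = patches_x * patches_y

# Assumption: the attention sequence concatenates the patch tokens of all cameras.
seq_len = num_cameras * tokens_per_image
```

With these values the patch grid is 16 by 12, giving 192 tokens per image and a sequence length of 384 across both cameras. The query, key, and value tensors passed to the attention function would need a sequence dimension of matching length.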

Highlighted Details

  • Improves LVSM performance on Novel View Synthesis tasks.
  • Aims to improve UniMatch performance on Stereo Depth Estimation.
  • Offers simple, single-file implementations for JAX and PyTorch.

Maintenance & Community

No specific community channels or roadmap details are provided in the README.

Licensing & Compatibility

The repository does not explicitly state a license.

Limitations & Caveats

The README does not detail specific limitations or known issues. Compatibility for commercial use or closed-source linking is not specified due to the lack of a stated license.

Health Check

  • Last Commit: 3 days ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 7

Star History

47 stars in the last 30 days
