nianticlabs: Learning-based Structure-from-Motion for scene reconstruction
Top 44.2% on SourcePulse
Summary
ACE0 addresses camera pose estimation for image collections by learning an implicit scene representation. It targets computer vision researchers and practitioners, offering a learning-based structure-from-motion approach that integrates with modern rendering pipelines like NeRF and Gaussian Splats.
How It Works
This project implements a learning-based structure-from-motion (SfM) pipeline that estimates camera parameters through a multi-view-consistent, implicit scene representation. It relies on a scene coordinate regression model and uses the RANSAC-based pose estimation of DSAC* for robust camera registration. Optionally, RGB-D data and reconstruction priors (depth distribution, 3D diffusion models) can be incorporated to improve scene reconstruction.
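To make the registration step more concrete, here is a minimal sketch of how a RANSAC-style solver might score one camera pose hypothesis against predicted scene coordinates. It is an illustration only, not the DSAC* implementation: the function names, the hard inlier threshold, and the default of 10 pixels are all assumptions made for this example.

```python
import math


def project(point_world, rotation, translation, focal, cx, cy):
    """Project a 3D world point into the image with a pinhole model.

    rotation (3x3 nested lists) and translation (length 3) map world
    coordinates to camera coordinates: x_cam = R * x_world + t.
    Returns None when the point lies behind the camera.
    """
    x, y, z = (
        sum(rotation[r][c] * point_world[c] for c in range(3)) + translation[r]
        for r in range(3)
    )
    if z <= 0:
        return None
    return (focal * x / z + cx, focal * y / z + cy)


def count_inliers(pixels, scene_coords, rotation, translation,
                  focal, cx, cy, threshold_px=10.0):
    """Count 2D-3D correspondences whose reprojection error is below the threshold.

    pixels: list of (u, v) image positions.
    scene_coords: list of (x, y, z) points predicted for those positions.
    threshold_px is a hypothetical value chosen for illustration.
    """
    inliers = 0
    for (u, v), point in zip(pixels, scene_coords):
        projected = project(point, rotation, translation, focal, cx, cy)
        if projected is None:
            continue
        if math.hypot(projected[0] - u, projected[1] - v) < threshold_px:
            inliers += 1
    return inliers
```

In a RANSAC loop, many pose hypotheses would be generated from small random subsets of correspondences, and the hypothesis with the highest score would be kept and refined.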
Quick Start & Requirements
Installation uses the provided environment.yml for Conda and requires PyTorch plus C++ compilation of the DSAC* bindings. Primary commands: conda env create -f environment.yml, then conda activate ace0, then cd dsacstar && python setup.py install && cd .. to build the bindings and return to the repository root. An NVIDIA V100 GPU is recommended; limited GPU memory can be mitigated with --training_buffer_cpu True. Docker support is available via docker-compose up -d.
Highlighted Details
The eccv_2024_checkpoint Git tag ensures reproducibility.
Maintenance & Community
Developed by Niantic, Inc. authors. Specific community channels or active maintenance signals beyond core development are not detailed in the provided README.
Licensing & Compatibility
Copyrighted by Niantic, Inc. 2024, with "Patent Pending" and "All rights reserved." This indicates a proprietary license, potentially restricting commercial use, redistribution, or integration into closed-source products. Users must consult the explicit license file for terms.
Limitations & Caveats
ACE0 assumes a single, shared focal length and principal point at the image center, lacking support for varying intrinsics or complex camera distortion models without significant implementation effort. Reconstruction from sparse views is challenging; performance favors dense scene coverage. Output poses are approximately metric and require similarity transform fitting for accurate ground truth comparison.
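The camera model assumption above can be written out as a small sketch. The helper name shared_intrinsics is hypothetical and not part of the ACE0 code base; it only illustrates what a single shared focal length and a principal point at the image center mean for the pinhole intrinsics matrix.

```python
def shared_intrinsics(width, height, focal):
    """Build a 3x3 pinhole intrinsics matrix K as nested lists.

    Assumes one focal length (in pixels) for both axes, no skew,
    no lens distortion, and a principal point at the image center.
    """
    return [
        [focal, 0.0, width / 2.0],
        [0.0, focal, height / 2.0],
        [0.0, 0.0, 1.0],
    ]


# Example: a 640x480 image with a focal length of 500 pixels.
K = shared_intrinsics(640, 480, 500.0)
```

Under this model the only free intrinsic parameter is the focal length, which is why image collections with mixed cameras or strong lens distortion fall outside the supported case.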