CVPR 2024 paper for unified 6D pose estimation/tracking of novel objects
Top 20.1% on sourcepulse
FoundationPose is a unified foundation model for 6D object pose estimation and tracking, supporting both model-based and model-free scenarios. It targets researchers and engineers in robotics and AR/VR who need to estimate object poses without object-specific fine-tuning, and achieves state-of-the-art results on challenging benchmarks.
How It Works
The approach unifies model-based (CAD model required) and model-free (few reference images) setups using a neural implicit representation for novel view synthesis. This allows pose estimation modules to remain invariant across both setups. Strong generalizability is achieved through large-scale synthetic training, a transformer-based architecture, contrastive learning, and LLM-aided data generation.
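To make the "6D pose" terminology concrete: a 6D pose couples a 3D rotation with a 3D translation, usually packed into a 4x4 SE(3) matrix that maps object-frame points into the camera frame. The following minimal NumPy sketch is illustrative only and does not reflect FoundationPose's actual API; the function names are hypothetical.

```python
import numpy as np

def pose_to_matrix(R, t):
    """Assemble a 6D pose (3x3 rotation R, 3-vector translation t)
    into a 4x4 homogeneous transform (hypothetical helper)."""
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = t
    return T

def transform_points(T, pts):
    """Apply a 4x4 pose to an (N, 3) array of object-frame points."""
    homo = np.hstack([pts, np.ones((len(pts), 1))])  # to homogeneous coords
    return (homo @ T.T)[:, :3]

# Example: 90-degree rotation about the z-axis plus a translation.
R = np.array([[0, -1, 0], [1, 0, 0], [0, 0, 1]], dtype=float)
t = np.array([0.1, 0.0, 0.5])
T = pose_to_matrix(R, t)
moved = transform_points(T, np.array([[1.0, 0.0, 0.0]]))
# moved -> [[0.1, 1.0, 0.5]]
```

An estimator in either setup outputs one such transform per object per frame; tracking refines it frame to frame.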
Quick Start & Requirements
Docker is the recommended setup (docker pull wenbowen123/foundationpose). For CUDA 12.1 support, use shingarey/foundationpose_custom_cuda121:latest. Conda installation is experimental.
Run the demo with python run_demo.py. For model-based evaluation, use python run_linemod.py and python run_ycb_video.py; for model-free, run bundlesdf/run_nerf.py followed by run_ycb_video.py.
Highlighted Details
Maintenance & Community
Licensing & Compatibility
Limitations & Caveats