FoundationPose by NVlabs

CVPR 2024 paper for unified 6D pose estimation/tracking of novel objects

Created 2 years ago

2,788 stars

Top 16.9% on SourcePulse

Project Summary

FoundationPose is a unified foundation model for 6D object pose estimation and tracking that supports both model-based and model-free setups. It targets researchers and engineers in robotics and AR/VR who need to estimate the pose of novel objects without object-specific fine-tuning, and it reports state-of-the-art results on challenging benchmarks.
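
For readers new to the term, a 6D pose is a 3D rotation plus a 3D translation, commonly packed into a 4x4 homogeneous matrix. The sketch below is a minimal illustration of that representation in plain Python, not FoundationPose's API: the helper names (rotation_z, make_pose, compose, transform_point) and the example numbers are assumptions made up for this illustration, and the idea of applying a small frame-to-frame update is only a generic picture of what tracking means.

```python
import math


def rotation_z(angle_rad):
    """3x3 rotation matrix for a rotation about the z-axis."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def make_pose(rotation, translation):
    """Pack a 3x3 rotation and a 3-vector translation into a 4x4 pose matrix."""
    pose = [list(rotation[i]) + [translation[i]] for i in range(3)]
    pose.append([0.0, 0.0, 0.0, 1.0])
    return pose


def compose(a, b):
    """Matrix product a @ b of two 4x4 poses (apply b first, then a)."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def transform_point(pose, point):
    """Map a 3D point from the object frame into the camera frame."""
    h = [point[0], point[1], point[2], 1.0]
    return [sum(pose[i][k] * h[k] for k in range(4)) for i in range(3)]


# Hypothetical estimate: the object is 0.5 m in front of the camera,
# rotated 90 degrees about the camera z-axis.
object_in_camera = make_pose(rotation_z(math.pi / 2), [0.0, 0.0, 0.5])

# Hypothetical frame-to-frame update: a 1 cm shift along the camera x-axis.
delta = make_pose(rotation_z(0.0), [0.01, 0.0, 0.0])
tracked = compose(delta, object_in_camera)

# A point 10 cm along the object's x-axis, expressed in the camera frame.
corner_in_camera = transform_point(object_in_camera, [0.1, 0.0, 0.0])
```

Estimation produces a matrix like object_in_camera from a single frame, while tracking keeps that matrix up to date as new frames arrive.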

How It Works

The approach unifies the model-based setup (a CAD model is required) and the model-free setup (a few reference images suffice) through a neural implicit representation used for novel view synthesis, so the downstream pose estimation modules stay the same in both setups. Strong generalizability comes from large-scale synthetic training, a transformer-based architecture, contrastive learning, and LLM-aided data generation.

Quick Start & Requirements

Installation: Docker is recommended (docker pull wenbowen123/foundationpose). For CUDA 12.1 support, use shingarey/foundationpose_custom_cuda121:latest. Conda installation is experimental.
Prerequisites: Python 3.9+, Eigen 3.4.0, NVDiffRast, PyTorch 2.0.0+ (with CUDA), Kaolin (for model-free).
Data: Requires downloading network weights, demo data, and optionally large-scale training data and preprocessed reference views.
Demo: python run_demo.py
Datasets: for the model-based setup, run python run_linemod.py and python run_ycb_video.py; for the model-free setup, run python bundlesdf/run_nerf.py first, then python run_ycb_video.py.
Links: Paper, Website, ROS Version, Demos
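
Before running the demo outside Docker, it can help to confirm that the Python-side prerequisites listed above are present. The following is a minimal sketch under stated assumptions, not a script shipped with the repository: the import names torch, nvdiffrast, and kaolin are assumed to match the installed packages, the function names are hypothetical, and the check covers neither Eigen (a C++ library) nor the CUDA toolkit.

```python
import importlib.util
import sys

MIN_PYTHON = (3, 9)
# Assumed import names for the Python packages listed in the prerequisites.
REQUIRED_MODULES = ("torch", "nvdiffrast")
OPTIONAL_MODULES = ("kaolin",)  # only needed for the model-free setup


def is_installed(module_name):
    """Return True if the module can be located without importing it."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except (ImportError, ValueError):
        return False


def check_environment(required=REQUIRED_MODULES, optional=OPTIONAL_MODULES,
                      min_python=MIN_PYTHON, version_info=None):
    """Report whether the interpreter and packages meet the listed prerequisites."""
    if version_info is None:
        version_info = sys.version_info
    return {
        "python_ok": tuple(version_info[:2]) >= tuple(min_python),
        "missing_required": [m for m in required if not is_installed(m)],
        "missing_optional": [m for m in optional if not is_installed(m)],
    }


if __name__ == "__main__":
    print(check_environment())
```

An empty missing_required list together with python_ok set to True suggests the model-based demo has its Python dependencies; kaolin appearing under missing_optional only matters for the model-free path.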

Highlighted Details

Ranked No. 1 on the worldwide BOP leaderboard for model-based novel object pose estimation (as of March 2024).
Outperforms methods specialized for each task by a large margin and achieves results comparable to instance-level methods.
Supports instant application to novel objects without fine-tuning.
Leverages LLMs for synthetic data generation.

Maintenance & Community

Developed by NVlabs with contributors including Bowen Wen, Wei Yang, Jan Kautz, and Stan Birchfield.
The project credits NVIDIA Isaac Sim and Omniverse for support.
Contact: Bowen Wen.

Licensing & Compatibility

NVIDIA Source Code License.
Copyright © 2024, NVIDIA Corporation. All rights reserved.
Restrictions: Not explicitly stated here; the NVIDIA Source Code License may limit commercial use.

Limitations & Caveats

The release omits the diffusion-based texture-augmented data and the corresponding weights because of legal restrictions, which may lead to slightly lower performance.
The troubleshooting section notes potential issues with newer GPUs (e.g., 4090) and with Windows setup.

Health Check

Last Commit: 10 months ago
Responsiveness: Inactive
Pull Requests (30d): 0
Issues (30d): 3

Star History

103 stars in the last 30 days

Explore Similar Projects

PonderV2 by OpenGVLab

3D pre-training framework for efficient 3D representations

Created 2 years ago

Updated 3 months ago

Awesome6DPoseEstimation by Jianqiuer

A curated collection of recent research on 6D pose estimation

Created 2 years ago

Updated 2 days ago

Awesome-Object-Pose-Estimation by CNJianLiu

Survey for deep learning-based object pose estimation

Created 1 year ago

Updated 2 months ago

EvoSkeleton by Nicholasli1995

Monocular 3D human pose estimation with evolutionary data

Created 5 years ago

Updated 4 years ago

acezero by nianticlabs

Learning-based Structure-from-Motion for scene reconstruction

Created 1 year ago

Updated 2 months ago

Rex-Omni by IDEA-Research

Multimodal LLM for versatile visual perception via next-point prediction

Created 3 months ago

Updated 1 day ago

EfficientPose by ybkscht

Efficient pose estimation implementation

Created 5 years ago

Updated 3 years ago

FFB6D by ethnhe

CVPR 2021 paper for 6D pose estimation via bidirectional RGBD fusion

Created 4 years ago

Updated 3 years ago

craves.ai by zuoym15

Pose estimator for controlling a robotic arm with vision

Created 7 years ago

Updated 8 months ago

Starred by Noah Snavely (Research Scientist at Google DeepMind; Professor at Cornell Tech).

DenseMatching by PruneTruong

PyTorch library for dense matching network research

Created 5 years ago

Updated 2 years ago

ViTPose by ViTAE-Transformer

PyTorch code for human/animal pose estimation research

Created 3 years ago

Updated 2 weeks ago

Starred by Luca Antiga (CTO of Lightning AI) and Kaichao You (Core Maintainer of vLLM).

deep-high-resolution-net.pytorch by leoxiaobin

PyTorch SDK for human pose estimation

Created 6 years ago

Updated 1 year ago
