LLM-powered zero-shot navigation via 3D scene graphs
Summary
SG-Nav addresses zero-shot object-goal navigation in 3D environments by constructing an online 3D scene graph to prompt Large Language Models (LLMs). This framework enables direct application to diverse scenes and object categories without requiring task-specific retraining, benefiting researchers and engineers in embodied AI and robotics.
How It Works
The core innovation lies in dynamically generating a 3D scene graph representation of the environment. This graph is then used to prompt LLMs, guiding them to understand spatial relationships and identify target objects for navigation. This online, graph-based prompting approach facilitates zero-shot generalization across unseen environments and object classes.
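The idea of growing a scene graph and turning it into an LLM prompt can be illustrated with a minimal sketch. This is not SG-Nav's implementation: the SceneGraph class, its method names, the node ids, and the prompt wording are all hypothetical, chosen only to show how objects and spatial relations might be accumulated online and serialized as text.

```python
from dataclasses import dataclass, field


@dataclass
class SceneGraph:
    """Toy incremental scene graph (hypothetical, not SG-Nav's data structure)."""
    nodes: dict = field(default_factory=dict)  # node id -> (category, (x, y, z))
    edges: list = field(default_factory=list)  # (source id, relation, target id)

    def add_object(self, node_id, category, position):
        # Called whenever a new object is observed during exploration.
        self.nodes[node_id] = (category, position)

    def add_relation(self, source, relation, target):
        # Relations may only connect objects that are already in the graph.
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both endpoints must be added before relating them")
        self.edges.append((source, relation, target))

    def to_prompt(self, goal):
        # Serialize nodes and edges as plain text an LLM can reason over.
        lines = ["Objects observed so far:"]
        for node_id, (category, (x, y, z)) in sorted(self.nodes.items()):
            lines.append(f"- {node_id}: {category} at ({x:.1f}, {y:.1f}, {z:.1f})")
        lines.append("Spatial relations:")
        for source, relation, target in self.edges:
            lines.append(f"- {source} {relation} {target}")
        lines.append(f"Question: which observed object is the {goal} most likely near?")
        return "\n".join(lines)


# Example: two observations and one relation, then a prompt for an unseen goal.
graph = SceneGraph()
graph.add_object("obj0", "sofa", (1.0, 0.0, 2.5))
graph.add_object("obj1", "coffee table", (1.2, 0.0, 1.8))
graph.add_relation("obj1", "in front of", "obj0")
prompt = graph.to_prompt("tv remote")
print(prompt)
```

Because the goal category only appears in the final question, the same graph can be reused for any target object, which is the property that makes a text-serialized graph a natural fit for zero-shot prompting.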
Quick Start & Requirements
Installation requires Python 3.9 and a conda environment. Key dependencies include habitat-sim==0.2.4, habitat-lab, PyTorch <=1.9 (specifically 1.9.1+cu111), PyTorch3D, FAISS (faiss-gpu=1.8.0), Grounding SAM, Grounding DINO, and GLIP models. Users must download the Matterport3D datasets and manually replace a file within the habitat-sim installation. Ollama is used for LLM inference, and the llama3.2-vision model must be pulled beforehand. The cu111 build tag implies CUDA 11.1. Setup involves a large data download and complex dependency management. The README links to the paper, project page, and a video.
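Since the setup depends on several exact version pins, a small pre-flight check can catch mismatches before a long install or demo run. The sketch below is an illustration only: the PINS table copies the versions listed above, while the function names and the prefix-based comparison rule (so that 1.9.1+cu111 satisfies <=1.9) are assumptions, not part of the project.

```python
def parse_version(text):
    """Turn '1.9.1+cu111' into (1, 9, 1); the build tag after '+' is ignored."""
    core = text.split("+", 1)[0]
    return tuple(int(part) for part in core.split(".") if part.isdigit())


# Pins taken from the requirements above; the table layout is hypothetical.
PINS = {
    "python": ("==", "3.9"),
    "habitat-sim": ("==", "0.2.4"),
    "torch": ("<=", "1.9"),
    "faiss-gpu": ("==", "1.8.0"),
}


def check_pins(installed, pins=PINS):
    """Return a list of human-readable problems; an empty list means all pins hold."""
    problems = []
    for name, (op, wanted) in pins.items():
        found = installed.get(name)
        if found is None:
            problems.append(f"{name}: not installed")
            continue
        want = parse_version(wanted)
        # Compare only as many components as the pin specifies,
        # so '3.9.18' matches '==3.9' and '1.9.1' satisfies '<=1.9'.
        have = parse_version(found)[:len(want)]
        ok = have == want if op == "==" else have <= want
        if not ok:
            problems.append(f"{name}: found {found}, expected {op}{wanted}")
    return problems


# Example with hand-written versions; a mismatched torch is reported.
installed = {
    "python": "3.9.18",
    "habitat-sim": "0.2.4",
    "torch": "2.1.0",
    "faiss-gpu": "1.8.0",
}
for problem in check_pins(installed):
    print(problem)
```

In a real environment the installed dictionary could be filled from importlib.metadata.version for each package and platform.python_version() for the interpreter; distribution names may differ from the labels used here.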
Highlighted Details
Maintenance & Community
The README does not provide links to community channels (e.g., Discord, Slack), a roadmap, or specific details on ongoing maintenance beyond project acceptance announcements.
Licensing & Compatibility
The repository's license is not specified in the README. This absence makes it difficult to assess compatibility for commercial use or integration into closed-source projects without further inquiry.
Limitations & Caveats
The installation process is complex, involving specific library versions and manual file modifications, which may lead to fragility. The use of older PyTorch versions (<=1.9) could present compatibility challenges with modern hardware or software stacks. Demo loading times are noted as significant.