LLM+VLM agent for robot spatio-temporal reasoning
Top 99.8% on sourcepulse
ReMEmbR enables robots to build and reason over long-horizon spatio-temporal memories using Large Language Models (LLMs) and Vision-Language Models (VLMs). It allows robots to answer complex questions about their environment and past experiences, such as where to navigate for a given instruction or when and where an event occurred. The target audience includes robotics researchers and developers working on embodied AI and memory-augmented systems.
How It Works
ReMEmbR integrates LLMs and VLMs with a persistent memory database, specifically MilvusDB, to store and retrieve spatio-temporal information. Memory items consist of captions (from VLMs), timestamps, and pose data. This approach allows for efficient querying and reasoning over extensive historical data, enabling more sophisticated robot behaviors and question-answering capabilities.
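To make the storage model concrete, below is a minimal sketch of how such memory items could be stored and retrieved with Milvus. This is an illustrative outline, not the project's actual code: the collection name, schema fields, embedding dimensionality, and the embed() helper are assumptions introduced for this example.

```python
"""Minimal sketch: ReMEmbR-style memory items in Milvus (illustrative only).

Assumes a local Milvus instance (e.g. started with launch_milvus_container.sh)
and uses a placeholder embed() instead of a real text-embedding model.
"""
import hashlib
import random

from pymilvus import MilvusClient

DIM = 512  # assumed embedding size; the real embedder determines this


def embed(text: str) -> list[float]:
    # Placeholder: deterministic pseudo-random vector derived from the text.
    # A real system would use an actual text-embedding model here.
    rng = random.Random(int(hashlib.sha256(text.encode()).hexdigest(), 16))
    return [rng.random() for _ in range(DIM)]


client = MilvusClient(uri="http://localhost:19530")
if not client.has_collection("robot_memory"):
    client.create_collection(collection_name="robot_memory", dimension=DIM)

# One memory item: VLM caption + timestamp + robot pose (x, y, heading).
item = {
    "id": 0,
    "vector": embed("a person walked past the elevator"),
    "caption": "a person walked past the elevator",
    "timestamp": 1717000000.0,
    "pose": [3.2, -1.5, 0.78],
}
client.insert(collection_name="robot_memory", data=[item])

# Retrieve the memories most relevant to a natural-language question;
# an LLM agent would then reason over the returned captions, times, and poses.
hits = client.search(
    collection_name="robot_memory",
    data=[embed("where did you last see a person?")],
    limit=3,
    output_fields=["caption", "timestamp", "pose"],
)
print(hits)
```

The key design point is that each memory item pairs a semantic caption with when and where it was observed, so retrieval can be filtered or ranked by text similarity, time, or position as the reasoning step requires.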
Quick Start & Requirements
Set up VILA by running ./vila_setup.sh, install Ollama, activate the conda environment, install dependencies with pip install -r requirements.txt, and launch MilvusDB via bash launch_milvus_container.sh start. A usage sketch follows these steps.
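Once Milvus is running, a typical session builds a memory from captioned observations and then queries it through the agent. The sketch below follows the usage pattern shown in the project's published examples; treat the module paths, class names (MilvusMemory, MemoryItem, ReMEmbRAgent), and parameters as assumptions that may differ from the current codebase.

```python
# Hedged usage sketch, assuming the interfaces from the project's examples;
# module paths, class names, and arguments may differ in the current codebase.
from remembr.memory.milvus_memory import MilvusMemory
from remembr.memory.memory import MemoryItem
from remembr.agents.remembr_agent import ReMEmbRAgent

# Connect to the Milvus instance started via launch_milvus_container.sh.
memory = MilvusMemory("test_collection", db_ip="127.0.0.1")
memory.reset()

# Insert one captioned observation with its time and pose.
memory.insert(MemoryItem(
    caption="I see a desk and a chair",
    time=1.1,
    position=[0.0, 0.0, 0.0],
    theta=3.14,
))

# Ask a question; the agent retrieves relevant memories and reasons over them.
agent = ReMEmbRAgent(llm_type="command-r")  # LLM assumed to be served via Ollama
agent.set_memory(memory)
response = agent.query("Where can I sit?")
print(response.text)
print(response.position)
```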
Highlighted Details
Maintenance & Community
The repository was last updated 4 months ago and is currently marked inactive.
Licensing & Compatibility
Licensing details are provided in LICENSE.md.
Limitations & Caveats
The project notes a potential GLIBCXX version error that requires a GCC update. It also highlights that dependencies may download models or data, necessitating a review of those components' licenses.