RoboBrain2.0 by FlagOpen

Embodied AI brain model for robotics

created 2 months ago
526 stars

Top 59.9% on SourcePulse

Project Summary

RoboBrain 2.0 is an open-source embodied AI model that unifies perception, reasoning, and planning for complex robotic tasks. It targets researchers and developers in embodied AI who are building generalist agents that must understand and interact with physical environments.

How It Works

RoboBrain 2.0 features a heterogeneous architecture combining a vision encoder with a large language model (LLM). It processes multi-modal inputs, including images, long videos, and structured scene graphs, alongside complex task instructions. The LLM decoder performs chain-of-thought reasoning to output structured plans, spatial relations, and coordinates, enabling capabilities like spatial understanding and temporal decision-making.
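To make the "structured plans, spatial relations, and coordinates" output more concrete, here is a minimal sketch of how a downstream robot controller might pull 2D points out of the model's text response. The output format is an assumption: this summary does not specify it, so the sample string, the `(x, y)` notation, and the `extract_points` helper are all hypothetical.

```python
import re
from typing import List, Tuple

# ASSUMPTION: points appear in the response text as "(x, y)" pairs.
# The real RoboBrain 2.0 output format may differ; check the README.
POINT_RE = re.compile(r"\(\s*(-?\d+(?:\.\d+)?)\s*,\s*(-?\d+(?:\.\d+)?)\s*\)")


def extract_points(text: str) -> List[Tuple[float, float]]:
    """Return every (x, y) pair found in the text, in order of appearance."""
    return [(float(x), float(y)) for x, y in POINT_RE.findall(text)]


# Hypothetical response: free-form reasoning followed by a coordinate answer.
sample = "Reasoning: the mug handle is on the right. Answer: [(412, 233), (430, 251)]"
points = extract_points(sample)
print(points)
```

Keeping the parsing step separate from inference means the controller can validate or reject coordinates (for example, points outside the image bounds) before acting on them.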

Quick Start & Requirements

  • Install: Clone the repository and install dependencies using pip install -r requirements.txt within a conda environment (Python 3.10 recommended).
  • Prerequisites: Access to model checkpoints (e.g., BAAI/RoboBrain2.0-7B) from Hugging Face or ModelScope.
  • Demo: Inference examples for general tasks, visual grounding, affordance prediction, trajectory prediction, pointing, and navigation are provided in the README.
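
The checkpoint prerequisite above could be handled with a short download script. This is a minimal sketch, assuming the `huggingface_hub` package is installed and the checkpoint is publicly downloadable; the `fetch_checkpoint` helper and the `checkpoints` directory are hypothetical names, and only the `BAAI/RoboBrain2.0-7B` identifier comes from the text above.

```python
from pathlib import Path

# Checkpoint identifier named in the prerequisites above.
MODEL_ID = "BAAI/RoboBrain2.0-7B"


def fetch_checkpoint(repo_id: str = MODEL_ID, local_dir: str = "checkpoints") -> Path:
    """Download a model snapshot from Hugging Face and return its local path.

    `local_dir` is a hypothetical location chosen for this example.
    """
    # Imported lazily so the module still loads where huggingface_hub is absent.
    from huggingface_hub import snapshot_download

    # Store each checkpoint in its own subfolder, e.g. checkpoints/RoboBrain2.0-7B.
    target = Path(local_dir) / repo_id.split("/")[-1]
    path = snapshot_download(repo_id=repo_id, local_dir=str(target))
    return Path(path)


# Usage (requires network access and enough disk space for the weights):
# checkpoint_path = fetch_checkpoint()
```

The inference examples in the README remain the authoritative reference for how to load and run the downloaded weights.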

Highlighted Details

  • Ships in 7B and 32B parameter variants; a 3B version is also available.
  • Reports state-of-the-art results on spatial and temporal reasoning benchmarks, where the README claims it outperforms leading open-source and proprietary models.
  • Enables real-world embodied intelligence capabilities such as spatial understanding and temporal decision-making.
  • The README points to FlagScale as the training framework and FlagEvalMM as the evaluation framework.

Maintenance & Community

The project is associated with BAAI (Beijing Academy of Artificial Intelligence). The README provides contact details for WeChat and RedNote.

Licensing & Compatibility

The README does not explicitly state a license. Further investigation is required for commercial use or closed-source linking.

Limitations & Caveats

The README does not detail specific limitations or known issues. The project appears to be actively developed with recent updates in July 2025.

Health Check

  • Last commit: 5 days ago
  • Responsiveness: Inactive
  • Pull requests (30d): 0
  • Issues (30d): 9
  • Star history: 187 stars in the last 30 days
