GR00T-Dreams by NVIDIA

Synthetic data generation for robot learning

Created 3 months ago
305 stars

Project Summary

GR00T Dreams is an NVIDIA initiative that addresses the robotics data problem by generating synthetic trajectory data with world models, so that robots can learn new tasks in unfamiliar environments without teleoperation data collected for those tasks. This blueprint provides a full pipeline for DreamGen, using Cosmos-Predict2 as the video world model.

How It Works

The project uses NVIDIA Cosmos-Predict2, a video world model, to generate synthetic robot trajectory data from a single image and a language instruction. The pipeline has five stages: fine-tuning the video world model, generating synthetic videos, extracting actions with an Inverse Dynamics Model (IDM), fine-tuning GR00T N1 on the resulting trajectories, and evaluating performance on DreamGenBench. The goal is to improve generalization in robot learning through diverse, instruction-driven synthetic data.
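The order of the five stages can be sketched as a chain of functions, each consuming the output of the previous one. This is an illustration only: every function name, dictionary key, and return value below is a hypothetical placeholder, not the repository's API, and no real model is called.

```python
from typing import Callable, Dict, List, Tuple

State = Dict[str, object]
Prompt = Tuple[str, str]  # (initial image path, language instruction)


def finetune_world_model(state: State) -> State:
    # Placeholder for adapting the video world model to the target robot.
    return {**state, "world_model": "finetuned"}


def generate_videos(state: State) -> State:
    # Placeholder: one synthetic video per (image, instruction) prompt.
    assert "world_model" in state, "world model must be fine-tuned first"
    videos = [f"video_for:{instruction}" for _, instruction in state["prompts"]]
    return {**state, "videos": videos}


def extract_idm_actions(state: State) -> State:
    # Placeholder: an inverse dynamics model labels each video with actions.
    trajectories = [(video, "pseudo_actions") for video in state["videos"]]
    return {**state, "trajectories": trajectories}


def finetune_policy(state: State) -> State:
    # Placeholder for fine-tuning the robot policy on the labelled videos.
    count = len(state["trajectories"])
    return {**state, "policy": f"policy_trained_on_{count}_trajectories"}


def evaluate(state: State) -> State:
    # Placeholder for the benchmark step.
    return {**state, "evaluated": True}


PIPELINE: List[Callable[[State], State]] = [
    finetune_world_model,
    generate_videos,
    extract_idm_actions,
    finetune_policy,
    evaluate,
]


def run_pipeline(prompts: List[Prompt]) -> State:
    # Thread a single state dictionary through the stages in order.
    state: State = {"prompts": prompts}
    for stage in PIPELINE:
        state = stage(state)
    return state
```

Calling `run_pipeline([("img0.png", "pick up the cup")])` returns a dictionary holding the placeholder videos, trajectories, and policy; in the real project each stage is a separate script documented in the repository.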

Quick Start & Requirements

  • Installation: Follow cosmos-predict2-setup for environment setup.
  • Prerequisites: Requires an NVIDIA GPU with CUDA support. Embodiment-specific scripts are available for Franka, GR1, SO-100, and RoboCasa.
  • Resources: Fine-tuning and inference of video world models can be resource-intensive.
  • Documentation: Detailed setup and training instructions are available in cosmos-predict2/documentations/training_gr00t.md.

Highlighted Details

  • Provides a full pipeline for generating synthetic robot videos and extracting actions.
  • Supports multiple robot embodiments for data generation and fine-tuning.
  • Includes evaluation scripts for Instruction Following (IF) and Physics Alignment (PA) using VLMs like Qwen2.5-VL and GPT-4o.
  • The benchmark evaluation uses a subset of videos and a relatively small open-source VLM, which may limit generalization to OOD scenarios.
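As a rough illustration of how Instruction Following (IF) and Physics Alignment (PA) scores might be aggregated over a set of generated videos, the sketch below averages per-video judge scores. It assumes each judge returns a number in [0, 1]; the `Sample` type, the `score_benchmark` function, and the stub judges are hypothetical stand-ins for a VLM call and do not reflect the repository's evaluation scripts.

```python
from statistics import mean
from typing import Callable, Dict, List, NamedTuple


class Sample(NamedTuple):
    video: str        # identifier of a generated video (hypothetical)
    instruction: str  # language instruction used to prompt generation


# A judge maps one sample to a score in [0, 1]. In practice this would
# wrap a VLM query; here it is any plain Python callable.
Judge = Callable[[Sample], float]


def score_benchmark(samples: List[Sample], if_judge: Judge, pa_judge: Judge) -> Dict[str, float]:
    # Average each judge's per-sample scores into one number per metric.
    if not samples:
        raise ValueError("need at least one sample to score")
    return {
        "instruction_following": mean(if_judge(s) for s in samples),
        "physics_alignment": mean(pa_judge(s) for s in samples),
    }


def stub_if_judge(sample: Sample) -> float:
    # Stub: pretend the video follows the instruction if its name mentions it.
    return 1.0 if sample.instruction in sample.video else 0.0


def stub_pa_judge(sample: Sample) -> float:
    # Stub: constant physics score, standing in for a VLM rating.
    return 0.5
```

With the stubs above, scoring one video that matches its instruction and one that does not yields an IF average of 0.5, which shows why the choice of judge model and video subset directly shapes the reported numbers.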

Maintenance & Community

  • This is an NVIDIA research initiative. The README does not list community channels or a roadmap.

Licensing & Compatibility

  • The repository is described as permissively licensed for research and commercial use. The README does not name the specific license; the permissive reading is inferred from NVIDIA's other open-source releases.

Limitations & Caveats

The benchmark evaluation protocol may not generalize well to out-of-distribution scenarios, such as multi-view videos or detailed physics judgment, because it relies on a limited dataset and a relatively small VLM.

Health Check

  • Last commit: 3 weeks ago
  • Responsiveness: Inactive
  • Pull requests (30d): 1
  • Issues (30d): 3
  • Star history: 33 stars in the last 30 days
