This repository provides a framework for language-guided robot skill acquisition, enabling robots to learn new tasks from natural language instructions without expert demonstrations or manual supervision. It is designed for researchers and engineers in robotics and AI interested in efficient, scalable robot learning.
How It Works
The framework uses language models to drive a data generation pipeline that produces diverse, labeled robot trajectories. To manage complexity, it organizes exploration hierarchically, with nested trajectories and exploration task trees. Seeded variation and language model queries diversify the generated data, which is then used to train language-conditioned diffusion policies.
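As a rough illustration of how an exploration task tree, nested trajectories, and seeded variation could fit together, here is a minimal sketch. It is not the repository's implementation: `TaskNode`, `stub_decompose`, `build_tree`, `rollout`, and `generate` are hypothetical names, the language model query is replaced by a hard-coded lookup table, and the single `param` value stands in for whatever a real primitive action would sample.

```python
import random
from dataclasses import dataclass, field


@dataclass
class TaskNode:
    """One node in a hypothetical exploration task tree."""
    description: str
    children: list["TaskNode"] = field(default_factory=list)


def stub_decompose(task: str) -> list[str]:
    """Stand-in for a language model query that splits a task into subtasks."""
    table = {
        "put the toy in the drawer": [
            "open the drawer",
            "move the toy into the drawer",
            "close the drawer",
        ],
    }
    # Tasks missing from the table are treated as leaves (no further split).
    return table.get(task, [])


def build_tree(task: str, decompose) -> TaskNode:
    """Recursively expand a task into a tree of subtasks."""
    node = TaskNode(task)
    for subtask in decompose(task):
        node.children.append(build_tree(subtask, decompose))
    return node


def rollout(node: TaskNode, rng: random.Random) -> dict:
    """Return a nested trajectory whose shape mirrors the task tree."""
    if not node.children:
        # Leaf: draw an (assumed) action parameter from the seeded generator.
        param = round(rng.uniform(-1.0, 1.0), 3)
        return {"task": node.description, "param": param, "steps": []}
    steps = [rollout(child, rng) for child in node.children]
    return {"task": node.description, "param": None, "steps": steps}


def generate(task: str, seed: int) -> dict:
    """Same seed gives the same trajectory; new seeds give new variations."""
    rng = random.Random(seed)
    return rollout(build_tree(task, stub_decompose), rng)
```

Calling `generate("put the toy in the drawer", seed=0)` twice returns identical nested trajectories, while changing the seed changes the sampled parameters. This is the property that makes seeded data generation reproducible yet diverse.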
Quick Start & Requirements
- Install: The README does not provide a specific installation command but mentions the use of Hydra for configuration.
- Prerequisites: Ubuntu 18.04, 20.04, or 22.04; NVIDIA GPUs (tested on GTX 1080, RTX A6000, RTX 3080, RTX 3090).
- Resources: Requires significant computational resources for data generation and policy training.
- Links: Project Page, arXiv, Video
Highlighted Details
- Language-guided data generation and diffusion policy training.
- No expert demonstrations, manual reward supervision, or manual language annotation required.
- Hierarchical actions and policies, exploration task trees, and seeded variation for data diversity.
- Utilizes language model queries for data labeling and control.
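To make the labeling idea above concrete, the sketch below shows one way a language model query could attach a success label to a finished trajectory without manual annotation. Everything here is an assumption for illustration: `label_trajectory`, `fake_lm`, the prompt wording, and the `drawer_open` state key are hypothetical, and `fake_lm` is a keyword-matching stub rather than a real language model.

```python
from typing import Callable


def label_trajectory(task: str, final_state: dict,
                     query_lm: Callable[[str], str]) -> dict:
    """Ask a language model whether the task succeeded and build a label."""
    prompt = (
        f"Task: {task}\n"
        f"Final state: {final_state}\n"
        "Did the robot succeed? Answer yes or no."
    )
    answer = query_lm(prompt).strip().lower()
    # The instruction doubles as the language annotation for the trajectory.
    return {"instruction": task, "success": answer.startswith("yes")}


def fake_lm(prompt: str) -> str:
    """Stub standing in for a real language model; it checks one keyword."""
    return "yes" if "'drawer_open': True" in prompt else "no"
```

With this stub, `label_trajectory("open the drawer", {"drawer_open": True}, fake_lm)` yields a record marked successful. Passing the model as a callable keeps the labeling logic independent of any particular language model provider.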
Maintenance & Community
- Supported by a Google Research Award and NSF Awards #2143601 and #2132519.
- The README acknowledges contributions from various individuals and projects, which suggests (but does not confirm) active development and community engagement.
- Contact: huy [at] cs [dot] columbia [dot] edu for questions.
Licensing & Compatibility
- The provided README snippet does not state a license. Check the repository's files for a license before reusing or redistributing the code.
Limitations & Caveats
- The README does not detail specific limitations or known issues. The framework's complexity and reliance on language models may introduce challenges in robustness and interpretability.