octo by octo-models

Robot policy for generalist manipulation, trained on 800k trajectories

Created 2 years ago

1,547 stars

Top 26.4% on SourcePulse

2 Experts Love This Project

pgarbacki

Cofounder of Fireworks AI

osanseviero

Omar Sanseviero

DevRel at Google DeepMind

Project Summary

Octo provides a transformer-based generalist robotic policy (GRP) trained on 800k robot trajectories, enabling zero-shot control via language or goal images across diverse robot setups. It's designed for researchers and practitioners in robotics and AI who need a versatile, adaptable policy for various manipulation tasks.

How It Works

Octo employs a modular attention structure within its transformer backbone. This design allows it to process multimodal inputs (RGB cameras, language, goal images) and output robot actions. The modularity facilitates efficient finetuning on new robot morphologies, sensory inputs, and action spaces with minimal data and compute.
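To make the idea of a modular attention structure more concrete, here is a minimal sketch of a block-wise attention mask over three token groups. The group names (task, obs, readout), the token counts, and the rule that readout tokens read from the other groups but are never read by them are illustrative assumptions, not Octo's actual implementation.

```python
# Illustrative block-wise attention mask; group names and rules are assumptions.

def build_mask(groups, rules):
    """Return mask[i][j] == True when token i may attend to token j.

    groups: list of (name, num_tokens) pairs in sequence order.
    rules:  dict mapping a group name to the set of group names it may attend to.
    """
    labels = [name for name, count in groups for _ in range(count)]
    return [[key in rules[query] for key in labels] for query in labels]


groups = [("task", 2), ("obs", 3), ("readout", 1)]
rules = {
    "task": {"task"},                       # task tokens only see each other
    "obs": {"task", "obs"},                 # observations see the task and each other
    "readout": {"task", "obs", "readout"},  # readout reads from every group
}
mask = build_mask(groups, rules)

# Print one row per query token: "x" = may attend, "." = masked out.
for row in mask:
    print("".join("x" if allowed else "." for allowed in row))
```

In this sketch no other group attends to the readout token, so adding or removing such a group leaves the computation for the existing tokens unchanged. That is one plausible way a modular mask can make it cheap to attach new inputs or outputs during finetuning.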

Quick Start & Requirements

Install with pip install -e . followed by pip install -r requirements.txt.
GPU users also need jax[cuda11_pip], pinned to version 0.4.20.
TPU users also need jax[tpu], pinned to version 0.4.20.
For JAX installation details, see the JAX GitHub repository.
Test the installation with python scripts/finetune.py --config.pretrained_path=hf://rail-berkeley/octo-small-1.5 --debug.
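Before running the finetuning test, it may help to confirm that the installed JAX matches the pinned version. The following is a small sketch using only the Python standard library; the helper names jax_pin_status and installed_version are hypothetical and not part of Octo.

```python
# Post-install sanity check; helper names are hypothetical, not part of Octo.
from importlib import metadata

PINNED_JAX = "0.4.20"  # version named in the requirements above


def jax_pin_status(installed, pinned=PINNED_JAX):
    """Describe how an installed JAX version string relates to the pin."""
    if installed is None:
        return "missing"
    return "ok" if installed == pinned else "mismatch"


def installed_version(package):
    """Return the installed version of a package, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


print("jax:", jax_pin_status(installed_version("jax")))
```

A result of "mismatch" or "missing" suggests reinstalling the GPU or TPU variant of JAX at the pinned version before debugging anything else.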

Highlighted Details

Supports multiple RGB camera inputs, various robot arms, and language/goal image instructions.
Achieves 13 it/sec inference on a single NVIDIA 4090 for the 93M-parameter Octo-Base model.
Offers example scripts for inference, finetuning, rollouts, and real-robot evaluation.
Pretraining requires ~1.2TB of data and significant compute (a TPUv4-128 pod for 8-14 hours).
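The throughput and dataset figures above translate into rough per-step and per-trajectory numbers. The sketch below is back-of-the-envelope arithmetic only; it assumes decimal terabytes and that the ~1.2TB figure covers all 800k trajectories, neither of which the summary states explicitly.

```python
# Back-of-the-envelope numbers derived from the figures quoted above.

ITERATIONS_PER_SEC = 13      # Octo-Base inference on one NVIDIA 4090
DATASET_BYTES = 1.2e12       # ~1.2 TB, assuming decimal terabytes
NUM_TRAJECTORIES = 800_000   # 800k trajectories

# Time spent on one inference iteration, in milliseconds.
latency_ms = 1000 / ITERATIONS_PER_SEC

# Average storage footprint of one trajectory, in megabytes.
avg_mb_per_trajectory = DATASET_BYTES / NUM_TRAJECTORIES / 1e6

print(f"{latency_ms:.1f} ms per inference iteration")
print(f"{avg_mb_per_trajectory:.1f} MB per trajectory on average")
```

Under these assumptions each inference iteration takes roughly 77 ms and a trajectory averages about 1.5 MB, which gives a sense of the control rates and storage a deployment would need to plan for.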

Maintenance & Community

Developed by the Octo Model Team, including researchers from UC Berkeley.
Citation available for academic use.

Licensing & Compatibility

The repository does not explicitly state a license in the README.

Limitations & Caveats

Pretraining requires substantial data (1.2TB) and compute resources.
The README does not specify a license, which may impact commercial use or closed-source integration.

Health Check

Last Commit

1 year ago

Responsiveness

Inactive

Pull Requests (30d)

0

Issues (30d)

0

Star History

23 stars in the last 30 days

Explore Similar Projects

Embodied-AI-Paper-TopConf by Songwxuan

Embodied AI paper list from top conferences

Created 11 months ago

Updated 2 months ago

Hybrid-VLA by PKU-HMI-Lab

Unified vision-language-action model

Created 11 months ago

Updated 4 months ago

Large-VLM-based-VLA-for-Robotic-Manipulation by JiuTian-VL

Advancing robotic manipulation with large Vision-Language-Action models

Created 9 months ago

Updated 2 months ago

OCRM_survey by RayYoh

A survey for embodied learning in object-centric robotic manipulation

Created 1 year ago

Updated 1 year ago

Being-H by BeingBeyond

Vision-language-action model for robot learning

Created 7 months ago

Updated 4 weeks ago

Starred by

Jeff Hammerbacher (Cofounder of Cloudera)

vla0 by NVlabs

State-of-the-art Vision-Language-Action models via text-based action representation

Created 4 months ago

Updated 4 days ago

embodied-agents by mbodiai

Integrate SOTA AI models into robotics

Created 1 year ago

Updated 2 months ago

X-VLA by 2toinf

Robotic control model using soft-prompted Transformers for cross-embodiment generalization

Created 5 months ago

Updated 2 weeks ago

CogACT by microsoft

Vision-language-action model for robotic manipulation

Created 1 year ago

Updated 3 months ago

Starred by

Phil Wang (Prolific Research Paper Implementer)

lingbot-vla by Robbyant

Pragmatic Vision-Language-Action model for robotics

Created 1 month ago

Updated 2 weeks ago

Starred by

Benjamin Bolte (Cofounder of K-Scale Labs)

Awesome-Robotics-Foundation-Models by robotics-survey

Robotics survey paper resources

Created 2 years ago

Updated 1 year ago

RoboticsDiffusionTransformer by thu-ml

Diffusion Transformer for bimanual robot manipulation

Created 1 year ago

Updated 1 month ago
