diffusion_policy by real-stanford

Visuomotor policy learning via action diffusion (research paper)

Created 2 years ago

3,621 stars

Top 13.3% on SourcePulse

Project Summary

Diffusion Policy is a framework for learning visuomotor policies with diffusion models, aimed at researchers and engineers in robotics and robot learning. It supports training and evaluating policies on both simulated and real-world robot tasks, using either low-dimensional state observations or image observations.

How It Works

At its core, Diffusion Policy runs a diffusion model over actions: starting from noise, it iteratively denoises a sequence of future actions conditioned on a history of observations. A unified interface between tasks and methods keeps the codebase modular and extensible. Data normalization, policy inference, and training/evaluation orchestration are handled by separate components (Datasets, Policies, and Workspaces), which abstract away environment-specific details.
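To make the denoising idea concrete, here is a minimal NumPy sketch of DDPM-style sampling of an action sequence conditioned on an observation history. It is an illustration under stated assumptions, not the repository's implementation: the function names, the linear noise schedule, and in particular the toy noise predictor (which stands in for a trained network and simply pretends the correct action is the mean of the observation history) are all hypothetical.

```python
import numpy as np

def make_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linear beta schedule plus the derived alpha terms used by DDPM."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)
    return betas, alphas, alpha_bars

def make_toy_noise_predictor(alpha_bars):
    """Hypothetical stand-in for a trained network eps(a_k, k, obs).

    It assumes the clean action at every horizon step equals the mean of the
    observation history and returns the noise implied by that assumption.
    """
    def predict(noisy_actions, step, obs_history):
        target = obs_history.mean(axis=0)
        ab = alpha_bars[step]
        return (noisy_actions - np.sqrt(ab) * target) / np.sqrt(1.0 - ab)
    return predict

def sample_action_sequence(noise_predictor, obs_history, horizon, action_dim,
                           schedule, rng):
    """Start from Gaussian noise and denoise it into an action sequence."""
    betas, alphas, alpha_bars = schedule
    actions = rng.standard_normal((horizon, action_dim))
    for k in reversed(range(len(betas))):
        eps = noise_predictor(actions, k, obs_history)
        # Posterior mean of the DDPM reverse step.
        mean = (actions - betas[k] / np.sqrt(1.0 - alpha_bars[k]) * eps) / np.sqrt(alphas[k])
        # Fresh noise is added at every step except the last one.
        noise = rng.standard_normal(actions.shape) if k > 0 else 0.0
        actions = mean + np.sqrt(betas[k]) * noise
    return actions

rng = np.random.default_rng(0)
schedule = make_schedule(num_steps=50)
obs_history = np.array([[0.2, -0.4], [0.4, -0.2]])  # two past observations
predictor = make_toy_noise_predictor(schedule[2])
actions = sample_action_sequence(predictor, obs_history, horizon=8,
                                 action_dim=2, schedule=schedule, rng=rng)
print(actions.shape)  # (8, 2)
```

Because the toy predictor always points at the same target, every row of the sampled sequence ends up at the mean observation. A trained network would instead produce a different action for each step of the horizon.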

Quick Start & Requirements

Installation: Create the Conda environment with conda env create -f conda_environment.yaml, or use Mamba with mamba env create -f conda_environment.yaml.
Prerequisites: Linux with an NVIDIA GPU (Ubuntu 20.04). Simulation needs the Mujoco system packages (libosmesa6-dev libgl1-mesa-glx libglfw3 patchelf); real-robot use also needs the RealSense SDK and the Spacemouse packages (libspnav-dev spacenavd).
Demo: Interactive Colab notebooks are available for state-based and vision-based environments.
Documentation: Project page and paper links are provided.

Highlighted Details

Supports both low-dimensional state inputs and high-dimensional image inputs.
Includes implementations for simulation benchmarks and real robot hardware (UR5).
Provides scripts for reproducing simulation results, training on real robot data, and evaluating pre-trained checkpoints.
Features a modular codebase structure for easily adding new tasks and methods.
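The separation between datasets, policies, and the code that wires them together can be sketched as follows. This is a hypothetical illustration of the pattern only: the class names, method names, and the min-max scaling to [-1, 1] are assumptions made for the example and are not the repository's API. The placeholder "model" just negates the normalized observation.

```python
import numpy as np

class MinMaxNormalizer:
    """Hypothetical helper: scales each feature linearly to [-1, 1]."""

    def fit(self, data):
        self.low = data.min(axis=0)
        high = data.max(axis=0)
        # Guard against constant features to avoid division by zero.
        self.span = np.where(high > self.low, high - self.low, 1.0)
        return self

    def normalize(self, x):
        return 2.0 * (x - self.low) / self.span - 1.0

    def unnormalize(self, x):
        return (x + 1.0) / 2.0 * self.span + self.low

class ToyDataset:
    """Hypothetical dataset: holds demonstrations and derives normalizers."""

    def __init__(self, obs, actions):
        self.obs, self.actions = obs, actions

    def get_normalizers(self):
        return MinMaxNormalizer().fit(self.obs), MinMaxNormalizer().fit(self.actions)

class ToyPolicy:
    """Hypothetical policy: raw observations in, raw actions out."""

    def set_normalizers(self, obs_norm, action_norm):
        self.obs_norm, self.action_norm = obs_norm, action_norm

    def predict_action(self, obs):
        n_obs = self.obs_norm.normalize(obs)
        n_action = -n_obs  # placeholder for a learned model
        return self.action_norm.unnormalize(n_action)

def run_workspace(dataset, policy, obs):
    """Hypothetical workspace: wires a dataset and a policy together."""
    policy.set_normalizers(*dataset.get_normalizers())
    return policy.predict_action(obs)

dataset = ToyDataset(
    obs=np.array([[0.0, 10.0], [2.0, 20.0], [4.0, 30.0]]),
    actions=np.array([[-1.0, 100.0], [0.0, 200.0], [1.0, 300.0]]),
)
result = run_workspace(dataset, ToyPolicy(), np.array([0.0, 30.0]))
print(result)
```

In this arrangement a new task only has to supply a dataset and a new method only has to supply a policy, while the caller never handles normalized values directly.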

Maintenance & Community

The project is associated with Columbia University and Toyota Research Institute. Experiment logs and further details are linked from the project website.

Licensing & Compatibility

License: MIT License.
Compatibility: Permissive license suitable for commercial use and integration with closed-source projects.

Limitations & Caveats

The macOS environment (conda_environment_macos.yaml) does not fully support the benchmarks. The codebase structure is flexible, but that flexibility comes at the cost of code repetition between tasks and methods.

Explore Similar Projects

Awesome-Robotics-Diffusion by showlab

Embodied-AI-Paper-TopConf by Songwxuan

gym-ignition by robotology-legacy

Awesome-Generalist-Robots-via-Foundation-Models by JeffreyYH

aloha_sim by google-deepmind

DexGraspVLA by Psi-Robot

CogACT by microsoft

awesome-machine-learning-robotics by Phylliade

peract by peract

3D-Diffusion-Policy by YanjieZe

RLBench by stepjam

openpi by Physical-Intelligence