Thought-Cloning by ShengranHu

Imitation learning framework for enhanced agent capability

Created 3 years ago

267 stars

Top 96.1% on SourcePulse

Project Summary

This repository provides the official implementation for Thought Cloning (TC), a novel imitation learning framework designed to enhance agent capabilities, AI safety, and interpretability by training agents to mimic human thought processes. It is targeted at researchers and developers in reinforcement learning and AI safety.

How It Works

Thought Cloning trains agents to predict human "thoughts" (intermediate reasoning steps) alongside actions, using a synthetic dataset that stands in for human demonstrations annotated with thoughts. This approach aims to give agents a more human-like reasoning process, leading to improved performance and interpretability compared to standard imitation learning. The implementation uses a Transformer encoder and an RNN decoder.
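
To make the idea of predicting thoughts alongside actions concrete, here is a minimal sketch of a joint imitation objective. It is not taken from the repository: the function names, the thought_weight parameter, and the choice of a simple weighted sum of two cross-entropy terms are assumptions made for illustration only.

```python
import math
from typing import Sequence


def cross_entropy(probs: Sequence[float], target: int) -> float:
    """Negative log-likelihood of the target index under a probability vector."""
    return -math.log(probs[target])


def thought_cloning_loss(
    thought_probs: Sequence[Sequence[float]],  # one distribution per thought token
    thought_targets: Sequence[int],            # demonstrated thought token ids
    action_probs: Sequence[float],             # distribution over actions
    action_target: int,                        # demonstrated action id
    thought_weight: float = 1.0,               # hypothetical weighting term
) -> float:
    """Weighted sum of a thought-imitation term and an action-imitation term."""
    # Average token-level loss over the demonstrated thought sequence.
    thought_loss = sum(
        cross_entropy(p, t) for p, t in zip(thought_probs, thought_targets)
    ) / len(thought_targets)
    # Standard behavioral-cloning loss on the demonstrated action.
    action_loss = cross_entropy(action_probs, action_target)
    return thought_weight * thought_loss + action_loss


if __name__ == "__main__":
    loss = thought_cloning_loss(
        thought_probs=[[0.5, 0.5], [0.25, 0.75]],
        thought_targets=[0, 1],
        action_probs=[0.1, 0.9],
        action_target=1,
    )
    print(f"joint loss: {loss:.4f}")
```

Setting thought_weight to 0 in this sketch reduces it to plain action imitation, which is the baseline the summary compares against.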

Quick Start & Requirements

Install: Clone the repo, create and activate a Python virtual environment, install PyTorch 1.7.1+ (with CUDA if available), then run "pip3 install --upgrade pip" followed by "pip3 install --editable ." from the repository root.
Prerequisites: Python >= 3.6, PyTorch >= 1.7.1, OpenAI Gym == 0.9.6, NumPy == 1.19.5, gym-minigrid == 1.0.0, blosc. Tested with Python 3.9.10 and PyTorch 1.7.1+cu110. Compatibility with newer versions of Gym, NumPy, or gym-minigrid is not guaranteed.
Setup: Requires downloading a synthetic thought dataset and trained model weights from Google Drive.
Links: Introduction Tweet Thread
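
Because the prerequisites pin exact versions, a quick check of the active environment can save debugging time. The sketch below is not part of the repository. It compares installed versions against the pins listed above using the standard library, and the distribution names ("gym", "numpy", "gym-minigrid") are assumptions about how the packages are published.

```python
from importlib import metadata

# Pins copied from the prerequisites above; distribution names are assumed.
PINNED = {"gym": "0.9.6", "numpy": "1.19.5", "gym-minigrid": "1.0.0"}


def check_pins(pins, get_version=metadata.version):
    """Return {package: (expected, found)} for each mismatch or missing package."""
    problems = {}
    for name, expected in pins.items():
        try:
            found = get_version(name)
        except metadata.PackageNotFoundError:
            found = None  # package is not installed in this environment
        if found != expected:
            problems[name] = (expected, found)
    return problems


if __name__ == "__main__":
    issues = check_pins(PINNED)
    for name, (expected, found) in issues.items():
        print(f"{name}: expected {expected}, found {found or 'not installed'}")
    if not issues:
        print("All pinned dependencies match.")
```

The get_version parameter exists only so the check can be exercised without touching the real environment.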

Highlighted Details

Official implementation for the NeurIPS '23 Spotlight paper "Thought Cloning: Learning to Think while Acting by Imitating Human Thinking".
Implemented on the BabyAI 2D gridworld domain with a synthetic human thought dataset.
Includes scripts for reproducing synthetic thought datasets, training TC models, and evaluating zero-shot performance on out-of-distribution environments.

Maintenance & Community

Primary contributor: Shengran Hu.
Based on BabyAI 1.1, dan-visdial, and visdial-rl.

Licensing & Compatibility

The repository itself does not explicitly state a license. The underlying projects (BabyAI, dan-visdial, visdial-rl) have varying licenses, which may impose restrictions. Users should verify licensing for all components.

Limitations & Caveats

The code was tested only against specific pinned versions of key dependencies (OpenAI Gym, NumPy, and gym-minigrid) and may be incompatible with newer releases.
Requires downloading large datasets and model weights from Google Drive.

Health Check

Last Commit

1 year ago

Responsiveness

Inactive

Star History

0 stars in the last 30 days