RLHF simulation framework for accessible instruction-following/alignment research
This repository provides AlpacaFarm, a simulation framework for developing and evaluating methods that learn from human feedback, such as RLHF. It targets researchers and developers in NLP and AI alignment, enabling them to iterate on feedback-based learning algorithms without the cost and complexity of collecting real human data.
How It Works
AlpacaFarm simulates pairwise preference data using large language models (like GPT-4) as automated annotators, mimicking human judgment with added noise for realism. It offers automated evaluation pipelines and reference implementations of key algorithms (PPO, Best-of-N, DPO, Expert Iteration) for instruction following and alignment research. This approach significantly reduces the cost and effort associated with developing these methods.
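As a rough illustration of the simulated-annotator workflow, the sketch below assumes the auto-annotation module exposes a PairwiseAutoAnnotator with an annotate_pairs method, following the project's documented examples; treat the exact import path, constructor arguments, and field names as assumptions to verify against the repository.

```python
# Minimal sketch, assuming the auto-annotation API below; requires OPENAI_API_KEY to be set.
from alpaca_farm.auto_annotations import PairwiseAutoAnnotator

# Each record pairs two candidate outputs for the same instruction (field names assumed).
outputs_pairs = [
    {
        "instruction": "Summarize the sentence.",
        "input": "AlpacaFarm simulates human feedback with LLM annotators.",
        "output_1": "It simulates feedback using LLM judges.",
        "output_2": "A framework exists.",
    }
]

# The simulated annotator pool mimics human raters and injects noise for realism.
annotator = PairwiseAutoAnnotator()
annotated = annotator.annotate_pairs(outputs_pairs)

# Each record should come back with a preference label indicating which output won.
print(annotated[0])
```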
Quick Start & Requirements
Install the package with pip install alpaca-farm. Simulated annotation requires an OpenAI API key (set the OPENAI_API_KEY environment variable). For optimizations like FlashAttention, install flash-attn and apex.
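For orientation, here is a minimal sketch of the automated evaluation path, assuming an alpaca_leaderboard helper in the auto-annotations module as shown in the project's examples; the helper name, its parameters, and the tiny in-memory outputs list are assumptions, and a real run would score outputs generated on the AlpacaFarm evaluation split.

```python
import os

# The simulated annotators call the OpenAI API, so the key must be available.
os.environ["OPENAI_API_KEY"] = "sk-..."  # placeholder; exporting it in your shell also works

# Assumed helper: compares your model's outputs against reference systems
# and reports a simulated win rate on the leaderboard.
from alpaca_farm.auto_annotations import alpaca_leaderboard

# Toy outputs for illustration; in practice these come from running your model
# on the AlpacaFarm evaluation instructions.
my_outputs = [
    {"instruction": "Name a prime number.", "input": "", "output": "7"},
]

alpaca_leaderboard(path_or_all_outputs=my_outputs, name="my-model")
```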
Highlighted Details
Maintenance & Community
The project is associated with the Tatsu Lab at Stanford University. The README notes that the default annotator changed from text-davinci-003 to GPT-4, which affects comparability with older results.
Licensing & Compatibility
The dataset and weight diffs are licensed under CC BY-NC 4.0, restricting them to non-commercial research use.
Limitations & Caveats
The non-commercial license on the dataset and weight diffs rules out commercial applications. Recent results are not directly comparable to older benchmarks because the primary annotator switched to GPT-4. Training RLHF with PPO requires at least eight 80GB A100 GPUs.