alpaca_farm by tatsu-lab

RLHF simulation framework for accessible instruction-following/alignment research

created 2 years ago
819 stars

Top 44.2% on sourcepulse

Project Summary

This repository provides AlpacaFarm, a simulation framework for developing and evaluating methods that learn from human feedback, such as RLHF. It targets researchers and developers in NLP and AI alignment, enabling them to iterate on feedback-based learning algorithms without the cost and complexity of collecting real human data.

How It Works

AlpacaFarm simulates pairwise preference data using large language models (like GPT-4) as automated annotators, mimicking human judgment with added noise for realism. It offers automated evaluation pipelines and reference implementations of key algorithms (PPO, Best-of-N, DPO, Expert Iteration) for instruction following and alignment research. This approach significantly reduces the cost and effort associated with developing these methods.
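
The annotation loop can be pictured as an LLM judge choosing between two candidate responses for the same instruction, with the label occasionally flipped so the simulated annotator disagrees the way human raters do. A minimal sketch, assuming the OpenAI Python client (openai>=1.0) and an illustrative 25% flip probability; this is not the alpaca_farm annotator API:

    import random
    from openai import OpenAI  # assumes the openai>=1.0 client; not part of alpaca_farm

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    JUDGE_TEMPLATE = (
        "Instruction:\n{instruction}\n\n"
        "Output (a):\n{output_a}\n\n"
        "Output (b):\n{output_b}\n\n"
        "Which output better follows the instruction? Answer with a single letter: a or b."
    )

    def simulate_preference(instruction, output_a, output_b,
                            judge_model="gpt-4", flip_prob=0.25, rng=random):
        """Return 0 if output_a is preferred, 1 if output_b is preferred."""
        response = client.chat.completions.create(
            model=judge_model,
            temperature=0.0,
            max_tokens=1,
            messages=[{"role": "user", "content": JUDGE_TEMPLATE.format(
                instruction=instruction, output_a=output_a, output_b=output_b)}],
        )
        answer = response.choices[0].message.content.strip().lower()
        preferred = 0 if answer.startswith("a") else 1
        # Flip the label with probability flip_prob to mimic human annotator noise
        # (the 0.25 value is illustrative, not the framework's calibrated setting).
        if rng.random() < flip_prob:
            preferred = 1 - preferred
        return preferred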

Quick Start & Requirements
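
The package is distributed on PyPI, so the usual setup is pip install alpaca-farm (or pip install -e . from a source checkout). Simulated annotation calls the OpenAI API and expects an OPENAI_API_KEY in the environment; the heavier training recipes need multi-GPU hardware (see Limitations & Caveats below).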

Highlighted Details

  • Supports simulation of preference data using GPT-4 and other LLMs.
  • Provides reference implementations for SFT, Reward Modeling, PPO, Best-of-N, Expert Iteration, Quark, and DPO (a Best-of-N sketch follows this list).
  • Includes automated evaluation for benchmarking models against AlpacaEval.
  • Offers pre-trained checkpoints for various methods trained on simulated and human preferences.
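
Of these methods, Best-of-N is the simplest baseline: sample n candidate responses from the policy and return the one the reward model scores highest. A minimal sketch in Python, where generate and reward_fn are hypothetical stand-ins rather than alpaca_farm functions:

    from typing import Callable, List

    def best_of_n(prompt: str,
                  generate: Callable[[str, int], List[str]],  # samples n completions for a prompt
                  reward_fn: Callable[[str, str], float],     # scores a (prompt, completion) pair
                  n: int = 16) -> str:
        """Sample n candidates and return the one with the highest reward-model score."""
        candidates = generate(prompt, n)
        scores = [reward_fn(prompt, c) for c in candidates]
        best_index = max(range(len(candidates)), key=scores.__getitem__)
        return candidates[best_index]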

Maintenance & Community

The project is associated with the Tatsu Lab at Stanford University. The README notes a change in annotators from text-davinci-003 to GPT-4, which affects comparability with older results.

Licensing & Compatibility

The dataset and model weight diffs are licensed under CC BY-NC 4.0, restricting them to non-commercial research use.

Limitations & Caveats

The released dataset and weight diffs are restricted to non-commercial research use. Results produced with GPT-4 as the primary annotator are not directly comparable to older text-davinci-003-based benchmarks. Training RLHF with PPO requires at least 8x 80GB A100 GPUs.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: 1 week
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 14 stars in the last 90 days

Explore Similar Projects

Starred by Sebastian Raschka (author of Build a Large Language Model From Scratch), Chip Huyen (author of AI Engineering and Designing Machine Learning Systems), and 1 more.

direct-preference-optimization by eric-mitchell

Reference implementation for Direct Preference Optimization (DPO)

Top 0.5% · 3k stars
created 2 years ago · updated 11 months ago