lm-human-preferences by openai

Code for fine-tuning language models using human preferences

Created 6 years ago
1,362 stars

Top 29.5% on SourcePulse

Project Summary

This repository provides code for fine-tuning language models based on human preferences, as detailed in the paper "Fine-Tuning Language Models from Human Preferences." It targets researchers and engineers interested in aligning language model behavior with human feedback, enabling the training of reward models and subsequent policy fine-tuning.

How It Works

The project implements a reinforcement learning approach where a reward model is trained on human-labeled preference data. This reward model then guides the fine-tuning of a language model (policy) to generate outputs that maximize the predicted reward. The core advantage lies in directly optimizing for human-defined quality metrics, moving beyond traditional supervised learning objectives.
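
As a rough illustration (plain numpy, not code from this repository), the sketch below shows the two quantities involved: a softmax cross-entropy loss that trains the reward model to score the human-chosen continuation highest, and the KL-penalized reward the policy maximizes during fine-tuning. The function names, the candidate count, and the kl_coef value are illustrative assumptions, not the repository's API.

    # Minimal numpy sketch (not this repo's code) of the two training signals.
    import numpy as np

    def reward_model_loss(candidate_rewards, chosen_index):
        """Softmax cross-entropy over k candidate continuations: the reward
        model should assign the highest score to the human-preferred one."""
        logits = np.asarray(candidate_rewards, dtype=np.float64)
        log_softmax = logits - (logits.max() + np.log(np.exp(logits - logits.max()).sum()))
        return -log_softmax[chosen_index]

    def penalized_reward(learned_reward, logprob_policy, logprob_pretrained, kl_coef=0.1):
        """Reward used during policy fine-tuning: the learned reward minus a
        KL penalty that keeps the policy close to the original language model."""
        return learned_reward - kl_coef * (logprob_policy - logprob_pretrained)

    # Example: four sampled continuations, the human preferred the third one.
    print(reward_model_loss([0.2, -1.0, 1.5, 0.3], chosen_index=2))
    print(penalized_reward(1.5, logprob_policy=-12.0, logprob_pretrained=-14.0))

The KL term is what keeps the fine-tuned policy close to the original language model, so it cannot simply collapse onto degenerate outputs that happen to score well under the learned reward.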

Quick Start & Requirements

  • Install: pipenv install
  • Prerequisites: Python 3.7.3, TensorFlow 1.13.1 (GPU version requires CUDA 10.0 and cuDNN 7.6.2), gsutil. Horovod is recommended for faster training.
  • Hardware: Tested on 8 V100 GPUs for training; development possible on macOS. CPU training is possible but very slow.
  • Data: Human labels are available at https://openaipublic.blob.core.windows.net/lm-human-preferences/labels (a download sketch follows this list).
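
If the labels are still hosted at the address above, a small helper along these lines can fetch an individual file. This helper is not part of the repository, and the exact file names under the prefix are not listed here, so the name argument is a placeholder.

    # Hypothetical download helper, not from this repo. `name` must be a file
    # you know to exist under the public labels prefix.
    import requests

    LABELS_BASE = "https://openaipublic.blob.core.windows.net/lm-human-preferences/labels"

    def download_label_file(name, dest=None):
        resp = requests.get(f"{LABELS_BASE}/{name}", timeout=60)
        resp.raise_for_status()          # fail clearly if the path has moved
        dest = dest or name
        with open(dest, "wb") as f:
            f.write(resp.content)
        return dest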

Highlighted Details

  • Code supports training reward models from human labels and fine-tuning language models using these reward models.
  • Pre-trained models and human labels are released.
  • Supports distributed training via Horovod (see the sketch after this list).
  • Includes scripts for training reward models, fine-tuning policies, and sampling from trained policies.
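
As a rough sketch of the Horovod pattern referenced above (TensorFlow 1.x style, written here for illustration rather than taken from the repository's training code): each worker pins one GPU, wraps its optimizer so gradients are averaged across workers, and broadcasts the initial variables from rank 0.

    # Illustrative Horovod + TF1 skeleton (not this repo's code).
    import tensorflow as tf
    import horovod.tensorflow as hvd

    hvd.init()                                          # one process per GPU
    config = tf.ConfigProto()
    config.gpu_options.visible_device_list = str(hvd.local_rank())

    w = tf.get_variable("w", initializer=0.0)           # stand-in for a real model
    loss = tf.square(w - 1.0)

    opt = tf.train.AdamOptimizer(1e-5 * hvd.size())     # scale LR with worker count
    opt = hvd.DistributedOptimizer(opt)                 # all-reduce the gradients
    train_op = opt.minimize(loss)

    hooks = [hvd.BroadcastGlobalVariablesHook(0)]       # sync initial weights
    with tf.train.MonitoredTrainingSession(hooks=hooks, config=config) as sess:
        sess.run(train_op)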

Maintenance & Community

The project is marked as "Archive" and no updates are expected. Pull requests are welcome.

Licensing & Compatibility

  • License: MIT
  • Compatibility: Suitable for commercial use.

Limitations & Caveats

The code is provided as-is and may no longer work due to migrated storage paths. It has only been tested with the smallest GPT-2 model (124M parameters) and Python 3.7.3.

Health Check

  • Last commit: 2 years ago
  • Responsiveness: 1 week
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 11 stars in the last 30 days

Explore Similar Projects

Both projects below are starred by Vincent Weisser (Cofounder of Prime Intellect), Chip Huyen (Author of "AI Engineering" and "Designing Machine Learning Systems"), and 6 more.

  • self-rewarding-lm-pytorch by lucidrains: Training framework for self-rewarding language models. ~1k stars, top 0.1% on SourcePulse. Created 1 year ago, updated 1 year ago.
  • RL4LMs by allenai: RL library to fine-tune language models to human preferences. ~2k stars, top 0.0% on SourcePulse. Created 3 years ago, updated 1 year ago.