lm-human-preferences by openai

Code for fine-tuning language models using human preferences

Created 6 years ago
1,374 stars

Top 29.2% on SourcePulse

Project Summary

This repository provides code for fine-tuning language models based on human preferences, as detailed in the paper "Fine-Tuning Language Models from Human Preferences." It targets researchers and engineers interested in aligning language model behavior with human feedback, enabling the training of reward models and subsequent policy fine-tuning.

How It Works

The project implements a reinforcement learning approach where a reward model is trained on human-labeled preference data. This reward model then guides the fine-tuning of a language model (policy) to generate outputs that maximize the predicted reward. The core advantage lies in directly optimizing for human-defined quality metrics, moving beyond traditional supervised learning objectives.
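
As a concrete illustration of the reward-model step described above, the snippet below is a minimal NumPy sketch, not the repository's TensorFlow code: a labeler picks one of several sampled continuations, and the reward model is trained with a softmax cross-entropy loss so that the chosen continuation receives the highest scalar reward. Function and variable names are illustrative.

```python
import numpy as np

def preference_loss(candidate_rewards, preferred_index):
    """Softmax cross-entropy over candidate rewards: the loss is small when the
    human-preferred candidate receives the highest scalar reward."""
    r = np.asarray(candidate_rewards, dtype=np.float64)
    # Numerically stable log-softmax: log p_i = r_i - logsumexp(r)
    logsumexp = r.max() + np.log(np.exp(r - r.max()).sum())
    return -(r[preferred_index] - logsumexp)

# Example: four sampled continuations; the labeler chose candidate 2,
# which the reward model already scores highest, so the loss is low.
rewards = [0.3, -1.2, 1.5, 0.1]   # scalar outputs of the reward model
print(preference_loss(rewards, preferred_index=2))
```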

Quick Start & Requirements

  • Install: pipenv install
  • Prerequisites: Python 3.7.3, TensorFlow 1.13.1 (GPU version requires CUDA 10.0 and cuDNN 7.6.2), gsutil. Horovod is recommended for faster training.
  • Hardware: Tested on 8 V100 GPUs for training; development possible on macOS. CPU training is possible but very slow.
  • Data: Human labels are available at https://openaipublic.blob.core.windows.net/lm-human-preferences/labels.
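
A minimal Python example of fetching one of the released label files from that URL; the file name used here (`sentiment/offline_5k.json`) is a guess for illustration only and should be replaced with a file that actually exists under that path.

```python
import urllib.request

BASE_URL = "https://openaipublic.blob.core.windows.net/lm-human-preferences/labels"
LABEL_FILE = "sentiment/offline_5k.json"  # hypothetical name; check the actual listing

# Download the label file and save it locally.
with urllib.request.urlopen(f"{BASE_URL}/{LABEL_FILE}") as response:
    payload = response.read()

local_name = LABEL_FILE.rsplit("/", 1)[-1]
with open(local_name, "wb") as f:
    f.write(payload)
print(f"Saved {len(payload)} bytes to {local_name}")
```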

Highlighted Details

  • Code supports training reward models from human labels and fine-tuning language models with those reward models (the KL-shaped reward used during fine-tuning is sketched after this list).
  • Pre-trained models and human labels are released.
  • Supports distributed training via Horovod.
  • Includes scripts for training reward models, fine-tuning policies, and sampling from trained policies.
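
During policy fine-tuning, the objective described in the paper is the learned reward minus a KL penalty that keeps the policy close to the original pretrained language model. The sketch below is illustrative Python, not the repository's implementation; `kl_coef` corresponds to the paper's KL coefficient, which can be fixed or adapted toward a target KL.

```python
def shaped_reward(reward_model_score, policy_logprob, pretrained_logprob, kl_coef):
    """Per-sample reward optimized during policy fine-tuning:
    R(x, y) = r(x, y) - kl_coef * [log pi(y|x) - log rho(y|x)],
    where pi is the policy being fine-tuned and rho is the pretrained model."""
    kl_estimate = policy_logprob - pretrained_logprob  # sample-based KL estimate
    return reward_model_score - kl_coef * kl_estimate

# Example: a high reward-model score is discounted when the policy drifts
# far from the pretrained model.
print(shaped_reward(reward_model_score=2.0,
                    policy_logprob=-35.0,
                    pretrained_logprob=-40.0,
                    kl_coef=0.2))  # 2.0 - 0.2 * 5.0 = 1.0
```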

Maintenance & Community

The project is marked as "Archive" and no updates are expected. Pull requests are welcome.

Licensing & Compatibility

  • License: MIT
  • Compatibility: Suitable for commercial use.

Limitations & Caveats

The code is provided as-is and may no longer run out of the box because the original cloud storage paths have since been migrated. It has been tested only with the smallest GPT-2 model (124M parameters) and with Python 3.7.3.

Health Check

  • Last Commit: 2 years ago
  • Responsiveness: 1 week
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 2 stars in the last 30 days

Explore Similar Projects

Starred by Vincent Weisser (Cofounder of Prime Intellect), Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), and 7 more.

self-rewarding-lm-pytorch by lucidrains
  • Training framework for self-rewarding language models
  • 1k stars · Created 2 years ago · Updated 1 year ago

Starred by Vincent Weisser (Cofounder of Prime Intellect), Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), and 6 more.

RL4LMs by allenai
  • RL library to fine-tune language models to human preferences
  • 2k stars · Created 3 years ago · Updated 1 year ago