PaLM-rlhf-pytorch by lucidrains

RLHF implementation on PaLM

Created 2 years ago
7,866 stars

Top 6.6% on SourcePulse

Project Summary

This repository provides a PyTorch implementation of Reinforcement Learning from Human Feedback (RLHF) applied to the PaLM architecture, aiming to replicate ChatGPT-like capabilities. It's targeted at researchers and developers interested in open-source LLM alignment and training.

How It Works

The project implements the three-stage RLHF pipeline: first train a base language model (PaLM), then train a reward model on human preference data, and finally fine-tune the language model with reinforcement learning (PPO), using the reward model's scores as the training signal. It uses Flash Attention for efficiency and offers optional LoRA fine-tuning for the reward model.
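To make the final PPO stage more concrete, here is a minimal sketch of the standard PPO clipped surrogate objective in plain Python. It is an illustration of the general technique, not code from the repository: the function name `ppo_clipped_objective`, the `clip_eps` default of 0.2, and the list-based inputs are all assumptions made for this example. In an RLHF setting the advantages would typically be derived from reward-model scores, but how this project computes them is not described here.

```python
import math


def ppo_clipped_objective(new_logprobs, old_logprobs, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate objective (a quantity to maximize).

    Hypothetical helper for illustration only; all names are assumptions.
    new_logprobs / old_logprobs: log-probabilities of the sampled actions
    under the current policy and the policy that generated the samples.
    advantages: per-sample advantage estimates.
    """
    total = 0.0
    for new_lp, old_lp, adv in zip(new_logprobs, old_logprobs, advantages):
        # Probability ratio between the current and the sampling policy.
        ratio = math.exp(new_lp - old_lp)
        # Restrict the ratio to [1 - eps, 1 + eps].
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Take the more pessimistic of the unclipped and clipped terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)


if __name__ == "__main__":
    # The ratio is e^0.5 here, so the positive-advantage term is capped at 1.2.
    print(ppo_clipped_objective([0.5], [0.0], [1.0]))
```

The clipping removes any incentive to move the policy far from the one that produced the samples in a single update, which is the property that makes PPO a common choice for the RL stage of RLHF.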

Quick Start & Requirements

  • Install: pip install palm-rlhf-pytorch
  • Requirements: PyTorch, CUDA (for GPU acceleration), and potentially large datasets for training.
  • Usage examples are provided for training the base PaLM model, the reward model, and the RLHF trainer.

Highlighted Details

  • Implements RLHF on the PaLM architecture, following the approach used for ChatGPT.
  • Includes optional LoRA fine-tuning for the reward model.
  • Leverages Flash Attention for improved performance.
  • Supports training a separate reward model and an RLHF trainer.
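As a rough illustration of the LoRA idea mentioned in the list above, the sketch below shows how a frozen weight matrix is combined with a trainable low-rank update. It uses plain Python lists rather than PyTorch, and the names `matmul`, `lora_effective_weight`, `lora_a`, `lora_b`, and `alpha` are assumptions chosen for this example; it does not reflect how the repository wires LoRA into its reward model.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_effective_weight(w, lora_a, lora_b, alpha):
    """Return W + (alpha / r) * B @ A for a frozen weight W.

    Hypothetical helper for illustration only.
    w:      frozen weight, shape (out, in)
    lora_a: trainable down-projection, shape (r, in)
    lora_b: trainable up-projection, shape (out, r)
    alpha:  scaling hyperparameter
    """
    rank = len(lora_a)
    scale = alpha / rank
    delta = matmul(lora_b, lora_a)  # low-rank update, shape (out, in)
    return [
        [w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]


if __name__ == "__main__":
    w = [[1.0, 0.0], [0.0, 1.0]]
    lora_a = [[1.0, 2.0]]    # rank 1
    lora_b = [[1.0], [0.0]]
    print(lora_effective_weight(w, lora_a, lora_b, alpha=2.0))
```

Only `lora_a` and `lora_b` would be trained, so the number of trainable parameters scales with the rank `r` rather than with the full size of `w`. Initializing `lora_b` to zeros, as is common practice, leaves the effective weight equal to `w` at the start of fine-tuning.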

Maintenance & Community

The project is sponsored by Stability.ai and acknowledges contributions from Hugging Face and CarperAI. It mentions ongoing work and potential successors like Direct Preference Optimization. Community discussion channels are not explicitly linked.

Licensing & Compatibility

The repository does not explicitly state a license in the README. Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

The project is marked as "WIP" (work in progress). It explicitly states that no trained model is included, and significant compute resources are required for training. The effectiveness of LoRA fine-tuning for the reward model is noted as open research.

Health Check

  • Last Commit: 2 weeks ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 1
  • Issues (30d): 2

Star History

18 stars in the last 30 days

Explore Similar Projects

Starred by Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems") and Junyang Lin (Core Maintainer at Alibaba Qwen).

safe-rlhf by PKU-Alignment

0.1%
2k
Safe RLHF for constrained value alignment in LLMs
Created 2 years ago
Updated 1 week ago
Starred by Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems") and Edward Sun (Research Scientist at Meta Superintelligence Lab).

Eureka by eureka-research

0.2%
3k
LLM-based reward design for reinforcement learning
Created 2 years ago
Updated 1 year ago
Starred by Vincent Weisser (Cofounder of Prime Intellect), Shizhe Diao (Author of LMFlow; Research Scientist at NVIDIA), and 4 more.

simpleRL-reason by hkust-nlp

0.1%
4k
RL recipe for reasoning ability in models
Created 7 months ago
Updated 1 month ago
Starred by Nat Friedman (Former CEO of GitHub), Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), and 19 more.

trlx by CarperAI

0.0%
5k
Distributed RLHF for LLMs
Created 3 years ago
Updated 1 year ago