ReaLHF by openpsi-project

Efficient RLHF training system for LLMs using parameter reallocation

Created 1 year ago
315 stars

Top 85.6% on SourcePulse

View on GitHub
1 Expert Loves This Project
Project Summary

This repository provides ReaLHF, a distributed system for efficient Reinforcement Learning from Human Feedback (RLHF) training of Large Language Models (LLMs). It targets researchers and engineers working on LLM alignment, offering significantly higher training throughput and memory efficiency through its novel parameter reallocation technique.

How It Works

ReaLHF employs a parameter reallocation strategy, dynamically redistributing LLM parameters and adapting parallelization across a cluster during training. This approach optimizes resource allocation for each workload, leading to superior PPO training throughput compared to existing systems, especially as model size and GPU count increase. It also supports advanced RLHF algorithms and features like CUDAGraphs for high-throughput generation.
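To make the idea concrete, here is a minimal, illustrative sketch of the core decision behind parameter reallocation: picking a different parallelization layout for each RLHF sub-workload on the same GPU pool. This is not ReaLHF's actual API; the `Layout` class, the `pick_layout` heuristic, and the degree choices are invented for illustration (real systems search over measured costs rather than using a fixed rule).

```python
# Illustrative sketch only (not ReaLHF's API): parameter reallocation means
# choosing a different (data, tensor, pipeline) parallel layout for each
# phase of an RLHF iteration, then redistributing parameters to match.
from dataclasses import dataclass


@dataclass(frozen=True)
class Layout:
    data: int      # data-parallel degree
    tensor: int    # tensor-parallel degree
    pipeline: int  # pipeline-parallel degree

    @property
    def gpus(self) -> int:
        return self.data * self.tensor * self.pipeline


def pick_layout(workload: str, n_gpus: int) -> Layout:
    """Toy heuristic: generation favors data parallelism (many independent
    rollouts), while training favors tensor parallelism (memory-heavy
    backward pass). A real system would profile and optimize this choice."""
    if workload == "generate":
        return Layout(data=n_gpus, tensor=1, pipeline=1)
    if workload == "train":
        tensor = min(8, n_gpus)
        return Layout(data=n_gpus // tensor, tensor=tensor, pipeline=1)
    raise ValueError(f"unknown workload: {workload}")


# "Reallocation" = switching between these layouts within one PPO iteration.
print(pick_layout("generate", 16))  # all 16 GPUs doing data-parallel rollout
print(pick_layout("train", 16))     # 2-way data x 8-way tensor for training
```

The point of the sketch is that neither layout is best for both phases; redistributing parameters between them is what lets each phase run at its own optimum.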

Quick Start & Requirements

  • Install from source, running the following in order:
      git clone ...
      pip install -r requirements.txt
      pip install git+https://github.com/NVIDIA/TransformerEngine.git@v1.8 --no-deps --no-build-isolation
      pip install flash_attn==2.4.2 --no-build-isolation
      pip3 install git+https://github.com/tgale96/grouped_gemm.git@v0.1.4 --no-build-isolation --no-deps
      REAL_CUDA=1 pip install -e . --no-build-isolation
  • GPU dependencies: NVIDIA TransformerEngine, FlashAttention, grouped_gemm.
  • Documentation: available online.
  • Tutorial: Reproduce full RLHF with 4xLLaMA-7B in 30 minutes.
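The install steps above can be run as a single script. Note two assumptions: the clone URL is inferred from the project name (the summary elides it), and a CUDA-capable environment with a matching PyTorch build is assumed to already be present.

```shell
# Setup sketch; clone URL inferred from the project name "openpsi-project/ReaLHF".
git clone https://github.com/openpsi-project/ReaLHF.git
cd ReaLHF
pip install -r requirements.txt
# Pinned GPU dependencies, installed without build isolation so they see
# the already-installed torch/CUDA toolchain.
pip install git+https://github.com/NVIDIA/TransformerEngine.git@v1.8 --no-deps --no-build-isolation
pip install flash_attn==2.4.2 --no-build-isolation
pip3 install git+https://github.com/tgale96/grouped_gemm.git@v0.1.4 --no-build-isolation --no-deps
# Build ReaLHF's own CUDA extensions and install in editable mode.
REAL_CUDA=1 pip install -e . --no-build-isolation
```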

Highlighted Details

  • Achieves state-of-the-art training throughput for RLHF via parameter reallocation.
  • Supports large-scale SFT, reward modeling, DPO, PPO, and generation.
  • Integrates seamlessly with HuggingFace checkpoints and vLLM.
  • Offers flexibility with Hydra configuration and support for custom algorithms like GRPO.

Maintenance & Community

  • Development moved to AReaL.
  • WeChat group available for technical discussions.
  • Notable contributors from Tsinghua University and OpenPsi Inc.
  • References implementations from Megatron-LM and DeepSpeed.

Licensing & Compatibility

  • No license specified in the README.
  • Compatible with HuggingFace and vLLM.

Limitations & Caveats

This repository has been archived, with development moved to a new project (AReaL). The specific license for this archived version is not stated, which may impact commercial use or closed-source integration.

Health Check
Last Commit

4 months ago

Responsiveness

1 day

Pull Requests (30d)
0
Issues (30d)
0
Star History
7 stars in the last 30 days

Explore Similar Projects

Starred by Jeff Hammerbacher (Cofounder of Cloudera), Stas Bekman (Author of "Machine Learning Engineering Open Book"; Research Engineer at Snowflake), and 25 more.

gpt-neox by EleutherAI

Top 0.2% on SourcePulse
7k stars
Framework for training large-scale autoregressive language models
Created 4 years ago
Updated 2 days ago
Starred by Tobi Lutke (Cofounder of Shopify), Li Jiang (Coauthor of AutoGen; Engineer at Microsoft), and 26 more.

ColossalAI by hpcaitech

Top 0.1% on SourcePulse
41k stars
AI system for large-scale parallel training
Created 3 years ago
Updated 15 hours ago