ReaLHF by openpsi-project

Efficient RLHF training system for LLMs using parameter reallocation

created 1 year ago
305 stars

Top 88.8% on sourcepulse

Project Summary

This repository provides ReaLHF, a distributed system for efficient Reinforcement Learning from Human Feedback (RLHF) training of Large Language Models (LLMs). It targets researchers and engineers working on LLM alignment, offering significantly higher training throughput and memory efficiency through its novel parameter reallocation technique.

How It Works

ReaLHF employs a parameter reallocation strategy, dynamically redistributing LLM parameters and adapting parallelization across a cluster during training. This approach optimizes resource allocation for each workload, leading to superior PPO training throughput compared to existing systems, especially as model size and GPU count increase. It also supports advanced RLHF algorithms and features like CUDAGraphs for high-throughput generation.
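The core idea above can be illustrated with a minimal, self-contained sketch. This is not the ReaLHF API; it is a conceptual toy showing what "adapting parallelization per workload" means: the same pool of GPUs is given a different parallel plan for generation (which favors many data-parallel replicas) than for training (which favors tensor/pipeline parallelism to fit optimizer state). All names here (`ParallelPlan`, `PLANS`, `plan_for`) are hypothetical.

```python
# Conceptual sketch of per-phase parameter reallocation (NOT the ReaLHF API).
# Each RLHF phase gets its own parallelization plan over the same GPU pool.

from dataclasses import dataclass


@dataclass(frozen=True)
class ParallelPlan:
    data_parallel: int
    tensor_parallel: int
    pipeline_parallel: int

    def world_size(self) -> int:
        # Total GPUs consumed by this plan.
        return self.data_parallel * self.tensor_parallel * self.pipeline_parallel


# Hypothetical plans for an 8-GPU cluster: generation runs many small
# replicas for throughput; training shards the model to fit optimizer state.
PLANS = {
    "generation": ParallelPlan(data_parallel=8, tensor_parallel=1, pipeline_parallel=1),
    "training": ParallelPlan(data_parallel=2, tensor_parallel=2, pipeline_parallel=2),
}


def plan_for(phase: str, num_gpus: int) -> ParallelPlan:
    """Select the phase's plan and verify it uses exactly the available GPUs."""
    plan = PLANS[phase]
    if plan.world_size() != num_gpus:
        raise ValueError(f"plan for {phase!r} needs {plan.world_size()} GPUs, have {num_gpus}")
    return plan


if __name__ == "__main__":
    print(plan_for("generation", 8))
    print(plan_for("training", 8))
```

In the real system, switching between such plans requires physically moving ("reallocating") parameter shards between GPUs at phase boundaries, which is the expensive step ReaLHF is designed to make efficient.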

Quick Start & Requirements

  • Install from source:

      git clone ...
      pip install -r requirements.txt
      pip install git+https://github.com/NVIDIA/TransformerEngine.git@v1.8 --no-deps --no-build-isolation
      pip install flash_attn==2.4.2 --no-build-isolation
      pip3 install git+https://github.com/tgale96/grouped_gemm.git@v0.1.4 --no-build-isolation --no-deps
      REAL_CUDA=1 pip install -e . --no-build-isolation
  • GPU dependencies: NVIDIA TransformerEngine, FlashAttention, grouped_gemm.
  • Documentation: available online.
  • Tutorial: Reproduce full RLHF with 4xLLaMA-7B in 30 minutes.

Highlighted Details

  • Achieves state-of-the-art training throughput for RLHF via parameter reallocation.
  • Supports large-scale SFT, reward modeling, DPO, PPO, and generation.
  • Integrates seamlessly with HuggingFace checkpoints and vLLM.
  • Offers flexibility with Hydra configuration and support for custom algorithms like GRPO.

Maintenance & Community

  • Development moved to AReaL.
  • WeChat group available for technical discussions.
  • Notable contributors from Tsinghua University and OpenPsi Inc.
  • References implementations from Megatron-LM and DeepSpeed.

Licensing & Compatibility

  • No license specified in the README.
  • Compatible with HuggingFace and vLLM.

Limitations & Caveats

This repository has been archived, with development moved to a new project (AReaL). The specific license for this archived version is not stated, which may impact commercial use or closed-source integration.

Health Check

  • Last commit: 3 months ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 19 stars in the last 90 days

Explore Similar Projects

Starred by Stas Bekman (Author of Machine Learning Engineering Open Book; Research Engineer at Snowflake).

HALOs by ContextualAI
Library for aligning LLMs using human-aware loss functions
Top 0.2% · 873 stars · created 1 year ago · updated 2 weeks ago

Starred by Jeff Hammerbacher (Cofounder of Cloudera) and Stas Bekman (Author of Machine Learning Engineering Open Book; Research Engineer at Snowflake).

InternEvo by InternLM
Lightweight training framework for model pre-training
Top 1.0% · 402 stars · created 1 year ago · updated 1 week ago

Starred by Lewis Tunstall (Researcher at Hugging Face), Robert Nishihara (Cofounder of Anyscale; Author of Ray), and 4 more.

verl by volcengine
RL training library for LLMs
Top 2.4% · 12k stars · created 9 months ago · updated 16 hours ago