verl by volcengine

RL training library for LLMs

Created 10 months ago
13,460 stars

Top 3.7% on SourcePulse

Project Summary

verl is a production-ready reinforcement learning (RL) training library for large language models (LLMs), designed for flexibility, efficiency, and scalability. It enables researchers and engineers to easily implement and train LLMs using various RL algorithms, integrating seamlessly with existing LLM infrastructure and supporting diverse hardware configurations.

How It Works

verl utilizes a hybrid-controller programming model that allows for flexible representation and efficient execution of complex post-training dataflows, simplifying the implementation of RL algorithms like GRPO and PPO. It decouples computation and data dependencies, facilitating integration with popular LLM frameworks such as FSDP, Megatron-LM, and vLLM. The library also supports flexible device mapping for efficient resource utilization across GPU clusters.
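Of the algorithms mentioned above, GRPO is the simplest to illustrate: it scores a group of sampled responses per prompt and normalizes each reward against the group's mean and standard deviation, avoiding a learned value function. A minimal sketch of that advantage estimate (illustrative only, not verl's internal API):

```python
# Group-relative advantage estimate in the style of GRPO (illustrative sketch).
# For one prompt, sample a group of responses, score each with a reward model,
# then normalize every reward against the group's own statistics.
from statistics import mean, stdev

def grpo_advantages(group_rewards, eps=1e-6):
    """Return one advantage per response in the group."""
    mu = mean(group_rewards)
    # Sample standard deviation; degenerate single-response groups get sigma=0.
    sigma = stdev(group_rewards) if len(group_rewards) > 1 else 0.0
    return [(r - mu) / (sigma + eps) for r in group_rewards]

# Four sampled responses for one prompt, scored by a (hypothetical) reward model.
advs = grpo_advantages([1.0, 0.0, 0.5, 0.5])
```

Responses scored above the group mean receive positive advantages and are reinforced; those below the mean are penalized, and the advantages within a group sum to zero.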

Quick Start & Requirements

  • Installation: pip install verl
  • Prerequisites: Python 3.8+, PyTorch 2.0+, Hugging Face Transformers, vLLM (>= v0.8.2 recommended), SGLang. GPU with CUDA support is highly recommended for efficient training.
  • Documentation: Quickstart, Programming Guide
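After installation, training runs are launched through verl's PPO entry point with Hydra-style configuration overrides on the command line. A minimal sketch; the dataset paths, model name, and override keys below are illustrative placeholders, so consult the Quickstart for the exact configuration:

```shell
# Install verl, then launch a small PPO run (placeholder paths and keys).
pip install verl

python3 -m verl.trainer.main_ppo \
    data.train_files=$HOME/data/gsm8k/train.parquet \
    data.val_files=$HOME/data/gsm8k/test.parquet \
    actor_rollout_ref.model.path=Qwen/Qwen2.5-0.5B-Instruct \
    trainer.n_gpus_per_node=1 \
    trainer.total_epochs=1
```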

Highlighted Details

  • Supports state-of-the-art throughput via integrations with LLM training and inference engines.
  • Features efficient actor model resharding with 3D-HybridEngine to reduce memory redundancy and communication overhead.
  • Compatible with Hugging Face Transformers and the ModelScope Hub, supporting models like Qwen-2.5, Llama3.1, and Gemma2.
  • Offers a wide range of RL algorithms including PPO, GRPO, ReMax, DAPO, and more, with support for vision-language models (VLMs).
  • Scales up to 70B models and hundreds of GPUs, with experiment tracking via wandb, swanlab, mlflow, and tensorboard.
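For the PPO family listed above, the core update is the clipped surrogate objective. A minimal per-token sketch (pure Python, not verl's implementation):

```python
# PPO's per-token clipped surrogate objective (illustrative sketch, not verl's code).
# ratio = pi_new(a|s) / pi_old(a|s); eps is the clip range (0.2 is a common default).
def ppo_clip_objective(ratio, advantage, eps=0.2):
    # Clip the probability ratio to [1 - eps, 1 + eps] ...
    clipped = max(min(ratio, 1.0 + eps), 1.0 - eps)
    # ... and take the pessimistic minimum of unclipped and clipped terms,
    # which bounds how far a single update can move the policy.
    return min(ratio * advantage, clipped * advantage)

# With a positive advantage, the gain from a large ratio is capped at (1 + eps) * A.
capped = ppo_clip_objective(ratio=1.5, advantage=1.0)
```

The asymmetry matters: for positive advantages large ratios are capped, while for negative advantages the unclipped (worse) term is kept, so the objective never rewards moving probability mass in the wrong direction.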

Maintenance & Community

  • Initiated by the ByteDance Seed team and maintained by the verl community.
  • Active development with regular releases and presentations at major conferences (e.g., EuroSys, NeurIPS).
  • Community channels available via Twitter (@verl_project).
  • Roadmap available at GitHub Issues.

Licensing & Compatibility

  • Apache 2.0 License.
  • Permissive license suitable for commercial use and integration with closed-source projects.

Limitations & Caveats

  • Megatron-LM backend support for AMD GPUs is noted as "coming soon."
  • Users are advised to avoid vLLM version 0.7.x due to known bugs.
  • The project is actively developed, and breaking changes may occur, with a discussion thread available for tracking them.

Health Check

  • Last Commit: 21 hours ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 227
  • Issues (30d): 179
  • Star History: 1,111 stars in the last 30 days

Explore Similar Projects

Starred by Bryan Helmig (Cofounder of Zapier), Will Brown (Research Lead at Prime Intellect), and 1 more.

ReCall by Agent-RL

  • Top 1.2%, 1k stars
  • RL framework for LLM tool use
  • Created 6 months ago, updated 4 months ago
Starred by Evan Hubinger (Head of Alignment Stress-Testing at Anthropic), Jiayi Pan (Author of SWE-Gym; MTS at xAI), and 1 more.

rl by pytorch

  • Top 0.4%, 3k stars
  • PyTorch library for reinforcement learning research
  • Created 3 years ago, updated 2 days ago