verl is a production-ready reinforcement learning (RL) training library for large language models (LLMs), designed for flexibility, efficiency, and scalability. It lets researchers and engineers implement and train LLMs with a range of RL algorithms, integrates with existing LLM infrastructure, and supports diverse hardware configurations.
How It Works
verl uses a hybrid-controller programming model that lets complex post-training dataflows be expressed flexibly and executed efficiently, simplifying the implementation of RL algorithms such as PPO and GRPO. By decoupling computation from data dependencies, it integrates with popular LLM frameworks such as FSDP, Megatron-LM, and vLLM, and it supports flexible device mapping for efficient resource utilization across GPU clusters.
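To make this concrete, the sketch below is a minimal, runnable toy of the single-controller loop that this design enables: the driver expresses the RL dataflow as plain Python, while in a real run each call would fan out to distributed worker groups (FSDP or Megatron-LM for training, vLLM or SGLang for rollout). All class and method names here are illustrative stand-ins, not verl's actual API.

import random

class ToyWorkers:
    """Stand-in for distributed worker groups; everything runs locally here."""

    def generate_sequences(self, prompts):
        # Rollout phase: a real framework dispatches this to an inference engine.
        return [{"prompt": p, "response": f"answer to {p}"} for p in prompts]

    def compute_reward(self, batch):
        # Placeholder scoring; a real run would use a reward model or rule-based check.
        for sample in batch:
            sample["reward"] = random.random()
        return batch

    def compute_advantages(self, batch):
        # Group-mean baseline, in the spirit of GRPO-style advantage estimation.
        baseline = sum(s["reward"] for s in batch) / len(batch)
        for sample in batch:
            sample["advantage"] = sample["reward"] - baseline
        return batch

    def update_actor(self, batch):
        # Training phase: a real framework runs a distributed policy update here.
        # Returns a toy summary statistic standing in for a training loss.
        return sum(abs(s["advantage"]) for s in batch) / len(batch)

workers = ToyWorkers()
for prompts in [["2+2=?", "3*3=?"]]:  # stand-in for a dataloader
    batch = workers.generate_sequences(prompts)
    batch = workers.compute_reward(batch)
    batch = workers.compute_advantages(batch)
    signal = workers.update_actor(batch)
    print(f"step done, toy loss proxy: {signal:.3f}")

The point of the pattern is that the control flow above stays in one script even when each method call executes on a different set of GPUs.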
Quick Start & Requirements
- Installation:
pip install verl
- Prerequisites: Python 3.8+, PyTorch 2.0+, Hugging Face Transformers, vLLM (v0.8.2+ recommended), and SGLang. A GPU with CUDA support is strongly recommended for efficient training.
- Documentation: Quickstart, Programming Guide (a condensed launch sketch follows below)
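As a rough illustration, a PPO run is launched from the command line with Hydra-style key=value overrides. The dataset paths and exact config keys below are assumptions; check the Quickstart for the authoritative form:

python3 -m verl.trainer.main_ppo \
    data.train_files=$HOME/data/gsm8k/train.parquet \
    data.val_files=$HOME/data/gsm8k/test.parquet \
    actor_rollout_ref.model.path=Qwen/Qwen2.5-0.5B-Instruct \
    actor_rollout_ref.rollout.name=vllm \
    trainer.n_gpus_per_node=1 \
    trainer.total_epochs=1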
Highlighted Details
- Achieves state-of-the-art throughput by integrating with existing LLM training engines (e.g., FSDP, Megatron-LM) and inference engines (e.g., vLLM, SGLang).
- Provides efficient actor-model resharding via 3D-HybridEngine, reducing memory redundancy and communication overhead when switching between training and generation.
- Compatible with Hugging Face Transformers and the ModelScope Hub, supporting models such as Qwen2.5, Llama 3.1, and Gemma 2.
- Implements a wide range of RL algorithms, including PPO, GRPO, ReMax, and DAPO, with support for vision-language models (VLMs); a configuration sketch follows this list.
- Scales to 70B-parameter models and hundreds of GPUs, with experiment tracking via wandb, swanlab, mlflow, and tensorboard.
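As a sketch of how these options are selected, the advantage estimator and experiment tracker are chosen with command-line overrides, combined with the data and model settings from the quickstart sketch above. The key names here are assumed from the Hydra-style config and should be verified against the docs:

python3 -m verl.trainer.main_ppo \
    algorithm.adv_estimator=grpo \
    trainer.logger='["console","wandb"]'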
Maintenance & Community
- Initiated by the ByteDance Seed team and maintained by the verl community.
- Active development with regular releases and presentations at major conferences (e.g., EuroSys, NeurIPS).
- Community channels available via Twitter (@verl_project).
- Roadmap available at GitHub Issues.
Licensing & Compatibility
- Apache 2.0 License.
- Permissive license suitable for commercial use and integration with closed-source projects.
Limitations & Caveats
- Megatron-LM backend support for AMD GPUs is noted as "coming soon."
- Users should avoid vLLM 0.7.x due to known bugs; a version pin is sketched below.
- The project is under active development; breaking changes may occur and are tracked in a dedicated discussion thread.
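For the vLLM caveat above, a minimal way to stay off the 0.7.x series is to pin the version at install time:

pip install "vllm>=0.8.2"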