RL training framework for multi-modality models
EasyR1 is an efficient and scalable reinforcement learning training framework for multi-modality vision-language models (VLMs). Built on the veRL project, it targets researchers and engineers working with VLMs and offers a high-performance solution for tasks like RL-based fine-tuning and policy optimization.
How It Works
EasyR1 leverages a HybridEngine design and vLLM's SPMD mode for efficiency and scalability. It supports various RL algorithms such as GRPO, Reinforce++, ReMax, and RLOO, and can process diverse text, vision-text, and multi-image-text datasets. Key features include padding-free training and robust logging integration with multiple platforms.
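To make the group-relative idea behind GRPO (one of the algorithms listed above) more concrete, here is a minimal sketch of how rewards for several responses sampled from the same prompt can be turned into advantages. This is a generic illustration, not EasyR1's implementation: the function name `group_relative_advantages`, the `eps` parameter, and the choice of population standard deviation are assumptions made for this example.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize the rewards of responses sampled for ONE prompt.

    Hypothetical helper for illustration only; not part of EasyR1 or veRL.
    Each response's advantage is its reward minus the group mean, divided
    by the group standard deviation (population std is assumed here).
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    # eps avoids division by zero when every response gets the same reward.
    return [(r - mu) / (sigma + eps) for r in rewards]


# Four sampled responses to one prompt, scored by a rule-based reward.
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)
print(advantages)  # positive for the rewarded responses, negative for the others
```

Because the baseline comes from the group itself, this style of estimator needs no separate value model, which is one reason group-based methods are attractive for large VLMs.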
Quick Start & Requirements
Run pip install -e . within the cloned repository. A Dockerfile is provided for environment setup.
Requires transformers>=4.51.0, flash-attn>=2.4.3, and vllm>=0.8.3. CUDA 12.6 and cuDNN are recommended via the provided Docker image.
Highlighted Details
Maintenance & Community
The project is a fork of veRL and cites its core contributors. A WeChat group is available for discussion.
Licensing & Compatibility
The project does not explicitly state a license in the README. It is a fork of veRL, which is Apache 2.0 licensed. Compatibility for commercial use or closed-source linking is not specified.
Limitations & Caveats
Vision-language models are not yet compatible with Ulysses parallelism. Support for LoRA is planned but not yet implemented. The project focuses solely on RL training and does not provide scripts for supervised fine-tuning or inference.