RLinf by RLinf

Reinforcement learning infrastructure for agentic AI

Created 5 months ago
2,067 stars

Top 21.4% on SourcePulse

View on GitHub
1 Expert Loves This Project
Project Summary

RLinf is an open-source infrastructure designed for post-training foundation models (LLMs, VLMs, VLAs) using reinforcement learning. It provides a flexible and scalable backbone for developing agentic AI, enabling open-ended learning and continuous generalization. The system is particularly beneficial for researchers and developers working on advanced AI training paradigms.

How It Works

RLinf introduces a novel "Macro-to-Micro Flow" (M2Flow) paradigm, which separates the logical workflow construction from physical communication and scheduling. This allows for programmable, high-level logical flows to be executed efficiently through micro-level operations. It supports flexible execution modes (Collocated, Disaggregated, Hybrid) and an automatic scheduling strategy that selects the optimal mode based on the training workload, eliminating the need for manual resource allocation.
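To make the M2Flow idea concrete, here is a minimal, hypothetical sketch of separating a high-level logical workflow from the execution-mode decision. The names (`Stage`, `choose_mode`) and the heuristic are illustrative assumptions for this summary, not RLinf's actual API:

```python
# Hypothetical sketch of the Macro-to-Micro Flow idea: the user declares a
# logical workflow; a scheduler (toy heuristic here) picks the execution mode.
# `Stage` and `choose_mode` are invented names, NOT RLinf's real interface.
from dataclasses import dataclass

@dataclass
class Stage:
    name: str
    gpu_fraction: float  # share of the cluster this stage would keep busy

def choose_mode(stages: list[Stage]) -> str:
    """Pick an execution mode from the workload shape (toy heuristic)."""
    total = sum(s.gpu_fraction for s in stages)
    if total <= 1.0:
        return "collocated"      # all stages fit in one shared resource pool
    if all(s.gpu_fraction >= 0.8 for s in stages):
        return "disaggregated"   # every stage saturates its own pool
    return "hybrid"              # pipeline light stages, isolate heavy ones

# Logical RL post-training flow, declared without any placement details:
workflow = [Stage("rollout", 0.9), Stage("inference", 0.4), Stage("train", 0.9)]
mode = choose_mode(workflow)  # scheduler, not the user, decides placement
```

The point of the sketch is the division of labor: the workflow declaration stays the same while the scheduler, given the workload, selects Collocated, Disaggregated, or Hybrid execution.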

Quick Start & Requirements

  • Installation: Details for installation are available in the README, with specific quickstart guides for PPO training of VLAs on Maniskill3 and GRPO training of LLMs on MATH.
  • Prerequisites: Supports FSDP + Hugging Face backends for rapid adaptation and Megatron + SGLang for large-scale training. Compatibility with mainstream CPU & GPU-based simulators like ManiSkill3 and LIBERO is provided.
  • Resources: Claims 120%+ throughput improvement with its hybrid mode and fine-grained pipelining. Automatic online scaling can improve efficiency by 20-40%.

Highlighted Details

  • Supports fast adaptation for VLA models like OpenVLA and π₀.
  • Enables RL fine-tuning of the π₀ model family with a flow-matching action expert.
  • Offers built-in support for popular RL methods including PPO, GRPO, DAPO, and Reinforce++.
  • Integrates LoRA for efficient fine-tuning and supports 5D Parallelism for Megatron-LM.
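Of the listed methods, GRPO is easy to illustrate: instead of a learned value baseline, it normalizes each sampled completion's reward against its own group. The sketch below shows that generic computation (the function name and reward values are illustrative, not taken from RLinf):

```python
# Illustrative, framework-agnostic computation of GRPO's group-relative
# advantage: rewards for completions of the same prompt are normalized
# to zero mean and unit variance within the group.
import statistics

def grpo_advantages(rewards: list[float], eps: float = 1e-6) -> list[float]:
    """Return group-normalized advantages for one prompt's completions."""
    mu = statistics.mean(rewards)
    sigma = statistics.pstdev(rewards)  # population std over the group
    return [(r - mu) / (sigma + eps) for r in rewards]

# Four completions sampled for the same prompt, scored 1 (correct) or 0:
advs = grpo_advantages([1.0, 0.0, 1.0, 0.0])
```

Completions above the group mean get positive advantage and are reinforced; those below get negative advantage, with no critic network required.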

Maintenance & Community

RLinf is a new project, with its formal v0.1 release and accompanying paper expected soon. It acknowledges inspiration from projects like VeRL, AReaL, Megatron-LM, SGLang, and PyTorch FSDP. Contact information is provided for inquiries and potential collaborations.

Licensing & Compatibility

The README does not explicitly state the license type or compatibility for commercial use.

Limitations & Caveats

The project is in its early stages, with a formal v0.1 release and paper forthcoming. The roadmap indicates planned support for heterogeneous GPUs, asynchronous pipeline execution, Mixture of Experts (MoE), vLLM inference backend, and various VLM/VLA training extensions, suggesting these features are not yet available.

Health Check
Last Commit

18 hours ago

Responsiveness

Inactive

Pull Requests (30d)
103
Issues (30d)
42
Star History
393 stars in the last 30 days

Explore Similar Projects

Starred by Jeff Huber (Cofounder of Chroma), Omar Khattab (Coauthor of DSPy, ColBERT; Professor at MIT), and 1 more.

arbor by Ziems

0%
302
Framework for optimizing DSPy programs with RL
Created 10 months ago
Updated 3 days ago
Starred by Wing Lian (Founder of Axolotl AI) and Stas Bekman (Author of "Machine Learning Engineering Open Book"; Research Engineer at Snowflake).

fms-fsdp by foundation-model-stack

0.7%
278
Efficiently train foundation models with PyTorch
Created 1 year ago
Updated 1 month ago
Starred by Yineng Zhang (Inference Lead at SGLang; Research Scientist at Together AI) and Jiayi Pan (Author of SWE-Gym; MTS at xAI).

Pai-Megatron-Patch by alibaba

0.7%
2k
Training toolkit for LLMs & VLMs using Megatron
Created 2 years ago
Updated 3 weeks ago