Relax by redai-infra

Asynchronous RL engine for scalable, omni-modal LLM post-training

Created 2 weeks ago


333 stars

Top 82.4% on SourcePulse

Project Summary

Relax is a high-performance, asynchronous reinforcement learning engine designed for omni-modal post-training at scale, particularly for multimodal large language models. It addresses the need for a unified, service-oriented framework that decouples training and inference, enabling efficient end-to-end RL training across text, vision, and audio modalities. The framework targets AI researchers and engineers working with large-scale RL applications, offering significant benefits in scalability, flexibility, and throughput.

How It Works

Relax employs a six-layer service-oriented architecture built on Ray Serve, where each component (Actor, Rollout, Critic, etc.) is an independent deployment. This design facilitates native service-level elastic scheduling and fault recovery. Core to its operation is the TransferQueue system, which enables fully asynchronous execution across independent GPU clusters for different training phases (Rollout, Actor, Advantages). This decoupling, combined with Megatron-LM for training and SGLang for inference, allows for high throughput via streaming data exchange and configurable staleness, optimizing resource utilization and training speed.
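The "configurable staleness" idea above can be made concrete with a small sketch: rollout workers tag each sample with the policy version that produced it, and the trainer only consumes samples whose version lag is within a bound. This is a minimal illustration, assuming hypothetical names (`TransferQueue`, `put`, `get_fresh`) — it is not Relax's actual TransferQueue API.

```python
from collections import deque

class TransferQueue:
    """Staleness-bounded transfer queue (illustrative sketch only;
    names and behavior are assumptions, not Relax's actual API).

    Rollout workers push samples tagged with the policy version that
    generated them; the trainer pops only samples whose version lag is
    within `max_staleness` of the current policy, dropping the rest.
    """

    def __init__(self, max_staleness: int = 1):
        self.max_staleness = max_staleness
        self._buffer: deque = deque()

    def put(self, sample, policy_version: int) -> None:
        """Called by rollout workers: enqueue a (sample, version) pair."""
        self._buffer.append((sample, policy_version))

    def get_fresh(self, current_version: int) -> list:
        """Called by the trainer: drain samples that are still fresh enough."""
        fresh = []
        while self._buffer:
            sample, version = self._buffer.popleft()
            if current_version - version <= self.max_staleness:
                fresh.append(sample)
            # Stale samples are silently discarded.
        return fresh

# Rollout produced at versions 3-5 while the trainer advanced to version 5.
q = TransferQueue(max_staleness=1)
q.put("traj_a", policy_version=3)   # lag 2 at consume time -> dropped
q.put("traj_b", policy_version=4)   # lag 1 -> accepted
q.put("traj_c", policy_version=5)   # lag 0 -> accepted
print(q.get_fresh(current_version=5))   # ['traj_b', 'traj_c']
```

Raising `max_staleness` lets rollout run further ahead of training (higher throughput, more off-policy data); setting it to 0 approximates synchronous on-policy training.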

Quick Start & Requirements

  • Primary install / run command: Recommended installation is via the official Docker image (docker pull relaxrl/relax:latest). After launching a container with appropriate GPU and IPC/network settings, clone the repo and install locally (pip install -e .).
  • Non-default prerequisites and dependencies: Requires GPUs with CUDA support. The Docker image bundles necessary versions of CUDA, PyTorch, Megatron-LM, SGLang, and Ray. Multi-node setup requires careful configuration of Ray and distributed storage.
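The documented install flow can be sketched as a shell session. The image name and `pip install -e .` step come from the summary above; the `docker run` flags and repository URL are typical assumptions for a multi-GPU training container, not verbatim from the Relax docs.

```shell
# Pull the official image (the README's recommended path).
docker pull relaxrl/relax:latest

# Launch with GPU access and host IPC/network settings; these flags are
# common for distributed training containers, but check the Relax docs
# for the exact recommended invocation.
docker run -it --gpus all --ipc=host --network=host \
    relaxrl/relax:latest /bin/bash

# Inside the container: clone and install in editable mode.
# (Repository URL is assumed from the org/repo names above.)
git clone https://github.com/redai-infra/relax.git
cd relax
pip install -e .
```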

Highlighted Details

  • Full Omni-Modal Training: Supports text, vision, and audio RL within a single framework, enabling end-to-end training for models like Qwen3-Omni.
  • Fully Async via TransferQueue: Rollout, Actor, and other components run on independent GPU clusters, exchanging data via streaming and configurable staleness for maximum throughput.
  • Agentic RL: Provides first-class support for multi-turn, closed-loop training with VLM multimodal context carry-over and flexible termination conditions.
  • Elastic Rollout Scaling: Dynamically scales inference engines mid-training via HTTP REST APIs, supporting both intra-cluster (ray_native) and cross-cluster (external) federation.
  • Megatron + SGLang Backends: Utilizes Megatron-LM for distributed training (TP/PP/CP/EP) and SGLang for high-throughput inference, with automatic weight conversion from HuggingFace.
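The elastic-rollout idea above can be illustrated with a toy fleet model: engines register or deregister mid-training, and requests are routed across whatever is currently available. All class and method names here (`RolloutFleet`, `scale_up`, `route`) are hypothetical — Relax exposes this via HTTP REST APIs, which this sketch does not reproduce.

```python
class RolloutFleet:
    """Toy model of elastic rollout scaling (illustrative only; not
    Relax's actual API). New requests are routed round-robin across
    whichever inference engines are currently registered, whether they
    live in the same Ray cluster or are federated from an external one.
    """

    def __init__(self):
        self._engines: dict[str, str] = {}  # engine_id -> cluster kind
        self._rr = 0                        # round-robin cursor

    def scale_up(self, engine_id: str, cluster: str = "ray_native") -> None:
        """Register a new inference engine mid-training."""
        self._engines[engine_id] = cluster

    def scale_down(self, engine_id: str) -> None:
        """Remove an engine; in-flight routing simply skips it afterwards."""
        self._engines.pop(engine_id, None)

    def route(self) -> str:
        """Pick the next engine round-robin; raises if the fleet is empty."""
        ids = sorted(self._engines)
        engine = ids[self._rr % len(ids)]
        self._rr += 1
        return engine

fleet = RolloutFleet()
fleet.scale_up("sgl-0")
fleet.scale_up("sgl-1", cluster="external")  # cross-cluster federation
print([fleet.route() for _ in range(3)])     # alternates across both engines
fleet.scale_down("sgl-0")                    # shrink the fleet mid-training
print(fleet.route())                         # remaining engine serves traffic
```

The point of the sketch is the control-plane shape: scaling decisions are runtime operations against a live registry, not a restart of the training job.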

Maintenance & Community

The project was open-sourced on April 15, 2026. Contributions are welcomed via a dedicated Contributing Guide. Specific community channels (e.g., Discord, Slack) or a public roadmap are not detailed in the README.

Licensing & Compatibility

This project is licensed under the Apache License 2.0. This license is permissive and generally compatible with commercial use and linking within closed-source projects.

Limitations & Caveats

The README does not explicitly state limitations such as alpha status or known bugs. The setup, particularly for multi-node and multi-GPU configurations, requires significant technical expertise with distributed systems (Ray) and deep learning frameworks. The asynchronous nature with configurable staleness may require careful tuning to balance on-policy accuracy with training throughput.

Health Check

  • Last Commit: 7 hours ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 7
  • Issues (30d): 13
  • Star History: 335 stars in the last 15 days

Explore Similar Projects

  • ROLL by alibaba: RL library for large language models. ~3k stars; created 11 months ago, updated 12 hours ago. Starred by Chip Huyen (author of "AI Engineering" and "Designing Machine Learning Systems"), Wing Lian (founder of Axolotl AI), and 3 more.
  • rl by pytorch: PyTorch library for reinforcement learning research. ~3k stars; created 4 years ago, updated 6 hours ago. Starred by Evan Hubinger (Head of Alignment Stress-Testing at Anthropic), Jiayi Pan (author of SWE-Gym; MTS at xAI), and 1 more.
  • trlx by CarperAI: Distributed RLHF for LLMs. ~5k stars; created 3 years ago, updated 2 years ago. Starred by Nat Friedman (former CEO of GitHub), Chip Huyen (author of "AI Engineering" and "Designing Machine Learning Systems"), and 19 more.