redai-infra: Asynchronous RL engine for scalable, omni-modal LLM post-training
Relax is a high-performance, asynchronous reinforcement learning engine designed for omni-modal post-training at scale, particularly for multimodal large language models. It addresses the need for a unified, service-oriented framework that decouples training and inference, enabling efficient end-to-end RL training across text, vision, and audio modalities. The framework targets AI researchers and engineers working with large-scale RL applications, offering significant benefits in scalability, flexibility, and throughput.
How It Works
Relax employs a six-layer service-oriented architecture built on Ray Serve, where each component (Actor, Rollout, Critic, etc.) is an independent deployment. This design facilitates native service-level elastic scheduling and fault recovery. Core to its operation is the TransferQueue system, which enables fully asynchronous execution across independent GPU clusters for different training phases (Rollout, Actor, Advantages). This decoupling, combined with Megatron-LM for training and SGLang for inference, allows for high throughput via streaming data exchange and configurable staleness, optimizing resource utilization and training speed.
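The decoupling described above can be sketched with a plain producer/consumer pipeline. This is a hypothetical illustration of the TransferQueue pattern, not Relax's actual API: rollout and training run concurrently (here as threads standing in for separate GPU clusters) and exchange data through a bounded queue, so samples stream to the trainer as they are produced instead of waiting for a full synchronous batch.

```python
# Illustrative sketch of asynchronous rollout/training decoupling via a
# bounded transfer queue. All names here are assumptions for illustration;
# Relax's real components are Ray Serve deployments, not threads.
import queue
import threading

transfer_queue = queue.Queue(maxsize=8)  # bounded buffer between phases

def rollout_worker(n_samples):
    # Produces trajectories asynchronously (stands in for SGLang inference).
    for i in range(n_samples):
        transfer_queue.put({"sample_id": i, "tokens": [i] * 4})
    transfer_queue.put(None)  # sentinel: rollout finished

def trainer(results):
    # Consumes trajectories as they stream in (stands in for Megatron-LM).
    while True:
        batch = transfer_queue.get()
        if batch is None:
            break
        results.append(batch["sample_id"])

results = []
producer = threading.Thread(target=rollout_worker, args=(16,))
consumer = threading.Thread(target=trainer, args=(results,))
producer.start(); consumer.start()
producer.join(); consumer.join()
print(results)  # all 16 samples processed as a stream
```

Because the queue is bounded, neither side can run arbitrarily ahead of the other, which is the same back-pressure idea that makes configurable staleness possible at cluster scale.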
Quick Start & Requirements
Pull the official Docker image (docker pull relaxrl/relax:latest). After launching a container with appropriate GPU and IPC/network settings, clone the repo and install locally (pip install -e .).

Highlighted Details
Supports both in-cluster (ray_native) and cross-cluster (external) federation.

Maintenance & Community
The project was open-sourced on April 15, 2026. Contributions are welcomed via a dedicated Contributing Guide. Specific community channels (e.g., Discord, Slack) or a public roadmap are not detailed in the README.
Licensing & Compatibility
This project is licensed under the Apache License 2.0. This license is permissive and generally compatible with commercial use and linking within closed-source projects.
Limitations & Caveats
The README does not explicitly state limitations such as alpha status or known bugs. The setup, particularly for multi-node and multi-GPU configurations, requires significant technical expertise with distributed systems (Ray) and deep learning frameworks. The asynchronous nature with configurable staleness may require careful tuning to balance on-policy accuracy with training throughput.
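The staleness trade-off mentioned above can be made concrete with a minimal sketch. This is an assumption about how such a knob typically behaves, not Relax's implementation: each sample carries the policy version that generated it, and the trainer only accepts samples within a configurable bound of its current version.

```python
# Hypothetical staleness bound, for illustration only: with a bound of 0
# training is fully on-policy (the trainer waits for current-version
# rollouts); larger bounds admit older samples, trading on-policy
# accuracy for higher throughput.
MAX_STALENESS = 2

def is_fresh(sample_version, trainer_version, max_staleness=MAX_STALENESS):
    """A sample is usable iff it lags the trainer by at most max_staleness."""
    return trainer_version - sample_version <= max_staleness

# Trainer at policy version 9: only samples from versions 7-9 qualify.
accepted = [v for v in range(10) if is_fresh(v, trainer_version=9)]
print(accepted)  # versions 7, 8, 9
```

Tuning amounts to raising the bound until GPU utilization stops improving, while monitoring reward curves for the off-policy drift the caveat describes.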