This repository provides a framework for decentralized Reinforcement Learning (RL) training at scale, targeting researchers and engineers working with large language models (LLMs) and complex RL tasks. It enables distributed training and inference across multiple nodes and GPUs, aiming to simplify and accelerate the development of advanced AI models.
How It Works
The project leverages a decentralized architecture, allowing training and inference processes to run independently and communicate across a distributed system. It uses `torchrun` for distributed training and `vLLM` for efficient inference, and supports parallelization strategies including Tensor Parallelism (TP), Pipeline Parallelism (PP), and Data Parallelism (DP). This approach is designed to maximize hardware utilization and scalability for large-scale RL experiments.
Quick Start & Requirements
- Install: `curl -sSL https://raw.githubusercontent.com/PrimeIntellect-ai/prime-rl/main/install.sh | bash`
- Environment: Requires Python 3.10+ and the `uv` package manager; `flash_attn` must be installed and functional.
- Hardware: Primarily designed for multi-GPU setups (NVIDIA GPUs with CUDA). Specific examples demonstrate configurations for 2, 4, and 8 GPUs, including multi-node setups.
- Docs: [Not explicitly linked, but examples cover setup and distributed inference.]
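A quick sanity check along the lines of the requirements above can be scripted. This is an illustrative sketch, not a script shipped with the repository:

```python
import shutil
import sys
from importlib.util import find_spec

def check_environment() -> list[str]:
    """Return human-readable problems with the local setup, per the README requirements."""
    problems = []
    if sys.version_info < (3, 10):
        problems.append("Python 3.10+ is required")
    if shutil.which("uv") is None:
        # uv is a standalone CLI tool, so we look for it on PATH
        problems.append("the uv package manager is not on PATH")
    if find_spec("flash_attn") is None:
        problems.append("flash_attn is not installed")
    return problems

for problem in check_environment():
    print(f"warning: {problem}")
```

An empty return value means the environment matches the stated requirements; anything printed should be resolved before launching training.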
Highlighted Details
- Supports distributed inference with TP, PP, and DP, including combinations like TP+PP and TP+DP.
- Provides detailed examples for running inference and training across various GPU configurations and multi-node setups.
- Includes a citation for the "INTELLECT-2" model, trained using this framework.
- Offers comprehensive test suites (unit, integration, GPU-specific, fast/slow tests).
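The supported combinations above (TP+PP and TP+DP, but not DP+PP) can be captured in a small validation helper. This is a hypothetical sketch, not part of the framework's API:

```python
def validate_parallelism(tp: int = 1, pp: int = 1, dp: int = 1) -> None:
    """Reject the DP+PP combination, which the README notes is unsupported."""
    if dp > 1 and pp > 1:
        raise ValueError("DP+PP is not supported; use TP+PP or TP+DP instead.")

validate_parallelism(tp=2, pp=2)  # TP+PP: accepted
validate_parallelism(tp=4, dp=2)  # TP+DP: accepted
```

Calling it with both `pp > 1` and `dp > 1` raises a `ValueError`, mirroring the caveat listed under Limitations.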
Maintenance & Community
- The project is associated with the "Prime Intellect Team" and lists several authors in its citation.
- No explicit links to community channels (Discord, Slack) or a public roadmap are provided in the README.
Licensing & Compatibility
- The README does not explicitly state a license; the provided citation suggests academic use is intended, and commercial-use implications are unclear without one.
Limitations & Caveats
- The README does not specify a license, which may impact commercial adoption.
- DP+PP configurations are explicitly noted as unsupported.
- The project appears to be in active development, with examples demonstrating specific model and dataset configurations.