prime-rl by PrimeIntellect-ai

Decentralized RL training codebase for scaling

created 5 months ago
396 stars

Top 74.0% on sourcepulse

View on GitHub
Project Summary

This repository provides a framework for decentralized Reinforcement Learning (RL) training at scale, targeting researchers and engineers working with large language models (LLMs) and complex RL tasks. It enables distributed training and inference across multiple nodes and GPUs, aiming to simplify and accelerate the development of advanced AI models.

How It Works

The project leverages a decentralized architecture, allowing training and inference processes to run independently and communicate across a distributed system. It utilizes torchrun for distributed training and vLLM for efficient inference, supporting various parallelization strategies like Tensor Parallelism (TP), Pipeline Parallelism (PP), and Data Parallelism (DP). This approach is designed to maximize hardware utilization and scalability for large-scale RL experiments.
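As a rough sketch of how these pieces combine (the entry-point module name below is a placeholder, not the project's documented CLI; the torchrun and vLLM flags are standard for those tools):

```shell
# Trainer side: torchrun spawns one process per GPU for distributed training.
# <train_module> is a placeholder for the project's actual training entry point.
torchrun --nproc-per-node 8 -m <train_module>

# Inference side: vLLM serves a model with tensor and pipeline parallelism
# combined (2 TP ranks x 2 PP stages = 4 GPUs). TP+DP works similarly,
# but the README notes that DP+PP is not supported.
vllm serve <model> --tensor-parallel-size 2 --pipeline-parallel-size 2
```

Because training and inference run as independent processes, they can be placed on separate nodes and scaled independently.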

Quick Start & Requirements

  • Install: curl -sSL https://raw.githubusercontent.com/PrimeIntellect-ai/prime-rl/main/install.sh | bash
  • Environment: Requires Python 3.10+ and the uv package manager; flash_attn must be installed and functional.
  • Hardware: Primarily designed for multi-GPU setups (NVIDIA GPUs with CUDA). Specific examples demonstrate configurations for 2, 4, and 8 GPUs, including multi-node setups.
  • Docs: Not explicitly linked; the repository's examples cover setup and distributed inference.
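A minimal setup sequence might look like the following (a sketch, assuming the installer leaves a prime-rl checkout in the working directory; `uv sync` and the import check are standard commands, but verify the exact steps against the README):

```shell
# Run the project's installer
curl -sSL https://raw.githubusercontent.com/PrimeIntellect-ai/prime-rl/main/install.sh | bash

# Assumption: the installer clones the repo into ./prime-rl
cd prime-rl

# uv resolves and installs the project's dependencies into a managed venv
uv sync

# Sanity-check that flash_attn is installed and importable on this machine
uv run python -c "import flash_attn; print('flash_attn OK')"
```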

Highlighted Details

  • Supports distributed inference with TP, PP, and DP, including combinations like TP+PP and TP+DP.
  • Provides detailed examples for running inference and training across various GPU configurations and multi-node setups.
  • Includes a citation for the "INTELLECT-2" model, trained using this framework.
  • Offers comprehensive test suites (unit, integration, GPU-specific, fast/slow tests).

Maintenance & Community

  • The project is associated with the "Prime Intellect Team" and lists several authors in its citation.
  • No explicit links to community channels (Discord, Slack) or a public roadmap are provided in the README.

Licensing & Compatibility

  • The README does not explicitly state a license, so commercial-use terms are unclear. The presence of a citation suggests academic use is intended.

Limitations & Caveats

  • The README does not specify a license, which may impact commercial adoption.
  • DP+PP configurations are explicitly stated as unsupported.
  • The project appears to be in active development; the examples demonstrate specific model and dataset configurations, which may require adaptation for other setups.
Health Check

  • Last commit: 1 day ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 150
  • Issues (30d): 1

Star History

  • 326 stars in the last 90 days

Explore Similar Projects

Starred by Aravind Srinivas (Cofounder of Perplexity), Stas Bekman (Author of Machine Learning Engineering Open Book; Research Engineer at Snowflake), and 12 more.

DeepSpeed by deepspeedai

  • Deep learning optimization library for distributed training and inference
  • Top 0.2%, 40k stars
  • created 5 years ago, updated 1 day ago