rl-swarm by gensyn-ai

Open-source framework for decentralized RL training swarms

Created 10 months ago

1,696 stars

Top 24.8% on SourcePulse

1 Expert Loves This Project

Georgios Konstantopoulos

CTO, General Partner at Paradigm

Project Summary

RL Swarm is an open-source, peer-to-peer framework for distributed reinforcement learning training. Users connect their own hardware to a decentralized network and train large language models collaboratively, with the aim of faster and more efficient model development than training alone. The system is designed to run on both consumer laptops and cloud GPUs, and participation is permissionless.

How It Works

RL Swarm uses a peer-to-peer architecture in which individual nodes contribute compute to train models collaboratively. It supports several Qwen 2.5 models and datasets, so users can pick a configuration that fits their hardware. An optional on-chain identity management layer, provided via Alchemy, tracks progress and participation.

Quick Start & Requirements

Install/Run: Create and activate a Python virtual environment (python3 -m venv .venv, then source .venv/bin/activate), then execute ./run_rl_swarm.sh.
Prerequisites: Python >= 3.10.
Hardware:
- Small models (0.5B/1.5B) + GSM8K: arm64/x86 CPU with 16GB RAM, or CUDA GPUs (RTX 3090/4090, A100, H100).
- Large models (7B/32B/72B) + DAPO-Math 17K: A100 (80GB) or H100 (80GB) recommended.
Setup: Follow the interactive prompts to select a swarm and a model, then log in via a browser window (port 3000).
Docs: Troubleshooting, Identity Management
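
As a rough illustration of the Python >= 3.10 prerequisite, the sketch below checks the interpreter version before setup. The function names version_ge and check_python are hypothetical helpers written for this summary, not part of rl-swarm, and the check assumes python3 is on the PATH.

```shell
# Hypothetical pre-flight helpers (not shipped with rl-swarm).

# Return 0 if version $1 is >= version $2. Only MAJOR.MINOR is compared;
# any patch component (e.g. the ".4" in 3.12.4) is ignored.
version_ge() {
    local have_major=${1%%.*} have_minor=${1#*.}
    local need_major=${2%%.*} need_minor=${2#*.}
    have_minor=${have_minor%%.*}
    need_minor=${need_minor%%.*}
    if [ "$have_major" -ne "$need_major" ]; then
        [ "$have_major" -gt "$need_major" ]
    else
        [ "$have_minor" -ge "$need_minor" ]
    fi
}

# Return 0 if the python3 on PATH satisfies the documented minimum (3.10).
check_python() {
    local found
    found=$(python3 -c 'import sys; print("%d.%d" % sys.version_info[:2])' 2>/dev/null) || return 1
    version_ge "$found" "3.10"
}
```

If check_python succeeds, the documented steps apply unchanged: python3 -m venv .venv, source .venv/bin/activate, then ./run_rl_swarm.sh.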

Highlighted Details

Supports training of multiple Qwen 2.5 models, from 0.5B to 72B parameters.
Offers participation in specific swarms like "Math" (GSM8K) and "Math Hard" (DAPO-Math 17K).
Optional on-chain identity registration via Alchemy for progress tracking.
Experimental support for CPU training and Windows via WSL.
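
The swarm choice can be tied back to the hardware guidance in Quick Start with a small sketch. The function recommend_swarm is a hypothetical helper, not part of rl-swarm, and the 80 GB GPU-memory threshold is an assumption derived from the "A100 (80GB) or H100 (80GB)" recommendation rather than a limit enforced by the project.

```shell
# Hypothetical helper (not part of rl-swarm): suggest a swarm from hardware.
# $1 = GPU memory in GB (0 if there is no CUDA GPU), $2 = system RAM in GB.
recommend_swarm() {
    local gpu_gb=$1 ram_gb=$2
    if [ "$gpu_gb" -ge 80 ]; then
        # Assumed threshold: 80 GB-class GPUs are recommended for large models.
        echo "Math Hard (DAPO-Math 17K, 7B/32B/72B models)"
    elif [ "$gpu_gb" -gt 0 ] || [ "$ram_gb" -ge 16 ]; then
        # Any smaller CUDA GPU, or a CPU machine with at least 16 GB RAM.
        echo "Math (GSM8K, 0.5B/1.5B models)"
    else
        echo "below documented minimum"
        return 1
    fi
}
```

For example, recommend_swarm 24 32 (a 24 GB GPU with 32 GB RAM) suggests the "Math" swarm, while recommend_swarm 80 64 suggests "Math Hard". The actual selection still happens through the interactive prompts of run_rl_swarm.sh.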

Maintenance & Community

The project is actively developed by Gensyn AI.
Community support is available via Discord (link not provided in README).
Troubleshooting and issue reporting are encouraged via GitHub Issues.

Licensing & Compatibility

The project is open source, but the README does not explicitly state a specific license. The term "permissionless" in the README describes participation in the network, not licensing terms.

Limitations & Caveats

This software is experimental and provided as-is. Performance on consumer hardware may be slow, and some configurations or platforms (e.g., certain VPSs, Windows without WSL) may require significant debugging. On-chain identity management works correctly only when its requirements for swarm.pem and email registration are met.
