rl-swarm  by gensyn-ai

Open-source framework for decentralized RL training swarms

Created 6 months ago
1,400 stars

Top 28.9% on SourcePulse

Project Summary

RL Swarm is an open-source, peer-to-peer framework for distributed reinforcement learning training. Users connect their own hardware to a decentralized network and train large language models collaboratively, with the aim of faster and more efficient model development than training alone. It is designed to run on both consumer laptops and cloud GPUs, and participation is permissionless.

How It Works

RL Swarm utilizes a peer-to-peer architecture where individual nodes contribute computational resources to train models collaboratively. It supports various Qwen 2.5 models and datasets, allowing users to select configurations based on their hardware capabilities. The system includes an optional on-chain identity management layer via Alchemy for tracking progress and participation.

Quick Start & Requirements

  • Install/Run: Create and activate a Python virtual environment (python3 -m venv .venv, then source .venv/bin/activate), then execute ./run_rl_swarm.sh.
  • Prerequisites: Python >= 3.10.
  • Hardware:
    • Small models (0.5B/1.5B) + GSM8K: arm64/x86 CPU with 16GB RAM, or CUDA GPUs (RTX 3090/4090, A100, H100).
    • Big models (7B/32B/72B) + DAPO-Math 17K: A100 (80GB) or H100 (80GB) recommended.
  • Setup: Follow the interactive prompts to select a swarm and a model, then log in via a browser window (port 3000).
  • Docs: Troubleshooting, Identity Management
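
The requirements above can be turned into a small preflight check before launching the script. The following is a minimal sketch, not part of RL Swarm: the function names, the GPU label strings, and the rule that 16GB of RAM is enough to fall back to the small-model swarm are all assumptions made for illustration, based only on the hardware list in this summary.

```python
import sys
from typing import Optional, Tuple

# Hypothetical labels; the summary only names the GPU families, not exact strings.
BIG_MODEL_GPUS = {"A100-80GB", "H100-80GB"}
SMALL_MODEL_GPUS = {"RTX 3090", "RTX 4090", "A100", "H100"}


def python_ok(version: Tuple[int, int] = sys.version_info[:2]) -> bool:
    """Return True if the interpreter meets the Python >= 3.10 prerequisite."""
    return tuple(version) >= (3, 10)


def suggest_swarm(gpu: Optional[str] = None, ram_gb: int = 0) -> Optional[str]:
    """Suggest a swarm name from the hardware tiers listed in this summary.

    "Math Hard" (DAPO-Math 17K, 7B/32B/72B) for 80GB A100/H100,
    "Math" (GSM8K, 0.5B/1.5B) for listed CUDA GPUs or a CPU with 16GB RAM,
    None if neither tier is met.
    """
    if gpu in BIG_MODEL_GPUS:
        return "Math Hard"
    if gpu in SMALL_MODEL_GPUS or ram_gb >= 16:
        return "Math"
    return None


if __name__ == "__main__":
    if not python_ok():
        print("Python >= 3.10 is required.")
    else:
        print("Suggested swarm:", suggest_swarm(gpu=None, ram_gb=16))
```

The actual selection happens in the interactive prompts of run_rl_swarm.sh; a check like this only helps decide which option to pick there.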

Highlighted Details

  • Supports training of multiple Qwen 2.5 models, from 0.5B to 72B parameters.
  • Offers participation in specific swarms like "Math" (GSM8K) and "Math Hard" (DAPO-Math 17K).
  • Optional on-chain identity registration via Alchemy for progress tracking.
  • Experimental support for CPU training and Windows via WSL.

Maintenance & Community

  • The project is actively developed by Gensyn AI.
  • Community support is available via Discord (link not provided in README).
  • Troubleshooting and issue reporting are encouraged via GitHub Issues.

Licensing & Compatibility

  • The project is described as open source, but the README does not explicitly state a license. The term "permissionless" describes who may join a swarm, not the license terms.

Limitations & Caveats

This software is experimental and provided as-is. Training on consumer hardware may be slow, and some configurations or platforms (e.g., certain VPSs, Windows without WSL) may require significant debugging. On-chain identity management works correctly only when its requirements for swarm.pem and email registration are met.

Health Check

  • Last commit: 3 weeks ago
  • Responsiveness: 1 day
  • Pull requests (30d): 5
  • Issues (30d): 14
  • Star history: 120 stars in the last 30 days
