rl-swarm by gensyn-ai

Open-source framework for decentralized RL training swarms

created 5 months ago
1,207 stars

Top 33.1% on sourcepulse

Project Summary

RL Swarm is an open-source, peer-to-peer framework for distributed reinforcement learning (RL) training. Users connect their own hardware to a decentralized network and train large language models collaboratively, with the stated aim of faster and more efficient model development. The system is designed to run on both consumer laptops and cloud GPUs, and participation is permissionless.

How It Works

RL Swarm uses a peer-to-peer architecture in which individual nodes contribute compute to train models collaboratively. It supports Qwen 2.5 models from 0.5B to 72B parameters, paired with datasets of matching difficulty, so users can pick a configuration that fits their hardware. An optional on-chain identity management layer, provided via Alchemy, tracks progress and participation.

Quick Start & Requirements

  • Install/Run: Create and activate a Python virtual environment (python3 -m venv .venv, then source .venv/bin/activate), then execute ./run_rl_swarm.sh.
  • Prerequisites: Python >= 3.10.
  • Hardware:
    • Small models (0.5B/1.5B) + GSM8K: arm64/x86 CPU with 16GB RAM, or CUDA GPUs (RTX 3090/4090, A100, H100).
    • Big models (7B/32B/72B) + DAPO-Math 17K: Recommended A100 (80GB) or H100 (80GB).
  • Setup: Follow the interactive prompts to select a swarm and a model, then log in via a browser window (port 3000).
  • Docs: Troubleshooting, Identity Management
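
The prerequisites above can be checked before launching the script. The following is a minimal sketch, not part of the project: the function names are hypothetical, and the check that port 3000 is free is an assumption about what the browser login step needs rather than something the README states.

```python
import socket
import sys

MIN_PYTHON = (3, 10)  # "Python >= 3.10" from the prerequisites
LOGIN_PORT = 3000     # port named in the setup step


def python_ok(version=sys.version_info) -> bool:
    """Return True if the interpreter meets the minimum version."""
    return tuple(version[:2]) >= MIN_PYTHON


def in_virtualenv() -> bool:
    """Return True when running inside a venv such as .venv."""
    return sys.prefix != sys.base_prefix


def port_is_free(port: int = LOGIN_PORT, host: str = "127.0.0.1") -> bool:
    """Return True if nothing is bound to the port (assumed requirement)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False
    return True


def preflight() -> dict:
    """Collect all checks into a name -> passed mapping."""
    return {
        "python>=3.10": python_ok(),
        "virtualenv active": in_virtualenv(),
        "port 3000 free": port_is_free(),
    }
```

Calling preflight() from inside the activated .venv returns a dictionary of pass/fail flags; any False entry is worth resolving before running ./run_rl_swarm.sh.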

Highlighted Details

  • Supports training of multiple Qwen 2.5 models, from 0.5B to 72B parameters.
  • Offers participation in specific swarms like "Math" (GSM8K) and "Math Hard" (DAPO-Math 17K).
  • Optional on-chain identity registration via Alchemy for progress tracking.
  • Experimental support for CPU training and Windows via WSL.
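
The pairing of swarms, model sizes, and hardware listed in the Quick Start section can be expressed as a small lookup. This is an illustrative sketch only: the Hardware class, the suggest_tier function, and the numeric thresholds are assumptions. The 24 GB figure is inferred from the RTX 3090/4090 being the smallest GPUs listed, and the 80 GB figure from the recommended A100/H100 variants.

```python
from dataclasses import dataclass
from typing import Optional

# Thresholds are assumptions derived from the hardware list, not project constants.
CPU_MIN_RAM_GB = 16      # "arm64/x86 CPU with 16GB RAM"
SMALL_GPU_VRAM_GB = 24   # RTX 3090/4090 class
BIG_GPU_VRAM_GB = 80     # A100 (80GB) or H100 (80GB)

TIERS = {
    "small": 'Swarm "Math" (GSM8K), models 0.5B/1.5B',
    "big": 'Swarm "Math Hard" (DAPO-Math 17K), models 7B/32B/72B',
    "unsupported": "Below the listed hardware requirements",
}


@dataclass
class Hardware:
    ram_gb: float
    gpu_vram_gb: Optional[float] = None  # None means CPU-only


def suggest_tier(hw: Hardware) -> str:
    """Return the key of the largest tier the hardware plausibly supports."""
    vram = hw.gpu_vram_gb or 0
    if vram >= BIG_GPU_VRAM_GB:
        return "big"
    if vram >= SMALL_GPU_VRAM_GB or hw.ram_gb >= CPU_MIN_RAM_GB:
        return "small"
    return "unsupported"
```

For example, TIERS[suggest_tier(Hardware(ram_gb=16))] yields the "Math" description for a CPU-only laptop. The interactive prompts in the actual script remain the authority on which combinations are accepted.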

Maintenance & Community

  • The project is actively developed by Gensyn AI.
  • Community support is available via Discord (link not provided in README).
  • Troubleshooting and issue reporting are encouraged via GitHub Issues.

Licensing & Compatibility

  • The project is open source, but the README does not explicitly state a specific license. Its use of "permissionless" describes participation in the network, not the licensing terms.

Limitations & Caveats

This software is experimental and provided as-is. Performance on consumer hardware may be slow, and some configurations or platforms (e.g., certain VPSs, Windows without WSL) may require significant debugging. On-chain identity management works correctly only when its requirements for swarm.pem and email registration are met.

Health Check

  • Last commit: 1 week ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 10
  • Issues (30d): 45

Star History

  • 659 stars in the last 90 days
