SkyRL by NovaSky-AI

RL training pipeline for multi-turn tool use LLMs, optimized for real-world tasks

created 3 months ago
666 stars

Top 51.5% on sourcepulse

Project Summary

SkyRL-v0 provides an open reinforcement learning (RL) training pipeline for multi-turn tool-use large language models (LLMs), optimized for long-horizon, real-environment tasks. It targets researchers and developers working on complex agentic systems, offering a framework to improve LLM performance on tasks like software engineering and text-to-SQL.

How It Works

SkyRL-v0 is a fork of the VeRL framework and builds on its asynchronous rollout capabilities. The design targets long-horizon tasks, in which agents interact with real environments over many turns. The integration of SGLang's async rollout feature is key to its performance: it parallelizes environment interaction and data collection across trajectories, which RL training depends on.
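The idea of asynchronous multi-turn rollouts can be sketched with plain Python asyncio. This is a minimal illustration, not SkyRL, VeRL, or SGLang code: `ToyEnv`, `fake_policy`, `rollout`, `collect`, and `run_demo` are all hypothetical names, and the short sleeps stand in for model inference and environment latency. It shows several trajectories of different lengths advancing concurrently instead of one after another.

```python
import asyncio
from typing import List, Tuple

Step = Tuple[str, str, float]  # (action, next observation, reward)


async def fake_policy(observation: str) -> str:
    """Hypothetical stand-in for an LLM call; the sleep mimics inference latency."""
    await asyncio.sleep(0.001)
    return f"action-for:{observation}"


class ToyEnv:
    """Hypothetical environment that ends after a fixed number of turns."""

    def __init__(self, task_id: int, max_turns: int) -> None:
        self.task_id = task_id
        self.max_turns = max_turns
        self.turn = 0

    async def step(self, action: str) -> Tuple[str, float, bool]:
        # The sleep mimics a slow real environment (e.g. running a tool).
        await asyncio.sleep(0.001)
        self.turn += 1
        done = self.turn >= self.max_turns
        reward = 1.0 if done else 0.0  # reward only on the final turn
        return f"obs-{self.task_id}-{self.turn}", reward, done


async def rollout(env: ToyEnv) -> List[Step]:
    """Run one multi-turn episode, awaiting the policy and the environment each turn."""
    trajectory: List[Step] = []
    obs = f"obs-{env.task_id}-0"
    done = False
    while not done:
        action = await fake_policy(obs)
        obs, reward, done = await env.step(action)
        trajectory.append((action, obs, reward))
    return trajectory


async def collect(envs: List[ToyEnv]) -> List[List[Step]]:
    """Advance all episodes concurrently; results come back in input order."""
    return await asyncio.gather(*(rollout(env) for env in envs))


def run_demo() -> List[List[Step]]:
    # Episodes have different lengths, as long-horizon tasks typically do.
    envs = [ToyEnv(task_id=i, max_turns=i + 2) for i in range(4)]
    return asyncio.run(collect(envs))


if __name__ == "__main__":
    for trajectory in run_demo():
        print(len(trajectory), "turns, final reward", trajectory[-1][2])
```

Because every `await` yields control to the event loop, a short episode can finish while a longer one is still waiting on its environment. A real pipeline would replace the sleeps with calls to an inference engine and to actual environments.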

Quick Start & Requirements

  • Installation: Clone the repository with submodules: git clone --recurse-submodules https://github.com/NovaSky-AI/SkyRL
  • Prerequisites: Refer to INSTALL.md for detailed instructions.
  • Resources: Training examples indicate requirements for multiple high-end GPUs (e.g., 8xH100 or 8xH200).
  • Links: Getting Started, SkyRL-SQL Blog Post, SkyRL-v0 Blog Post

Highlighted Details

  • SkyRL-Agent-14B-v0 achieves 21.6% on SWE-Bench-Verified, 3.6 percentage points above its base model.
  • SkyRL-SQL-7B outperforms GPT-4o and o4-mini on Spider benchmarks by up to 9.2% with multi-turn RL.
  • Training for SkyRL-Agent-14B-v0 took 20 hours on 8xH200 GPUs.
  • SkyRL-SQL-7B was trained on only 653 samples.

Maintenance & Community

  • The project is associated with Berkeley Sky Computing Lab.
  • Compute support comes from Anyscale, Databricks, NVIDIA, Lambda Labs, and AMD.
  • Key contributors from Tsinghua University and OpenBMB/ModelBest are acknowledged for SGLang integration.
  • Community links include Website, X, and Discord.

Licensing & Compatibility

  • The repository's license is not explicitly stated in the README.
  • Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

The codebase is a fork of VeRL and is undergoing refactoring to align with the VeRL main branch. The training examples assume multiple high-end GPUs (e.g., 8xH100 or 8xH200), which may present an adoption barrier. Because the README does not state a license, commercial use requires clarification first.

Health Check

  • Last commit: 3 days ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 50
  • Issues (30d): 20
  • Star History: 684 stars in the last 90 days
