SkyRL by NovaSky-AI

RL training pipeline for multi-turn tool-use LLMs, optimized for real-world tasks

Created 8 months ago

1,438 stars

Top 28.1% on SourcePulse

17 Experts Love This Project

Lewis Tunstall

Research Engineer at Hugging Face

Yaowei Zheng

Author of LLaMA-Factory

and 13 more!

Project Summary

SkyRL-v0 provides an open reinforcement learning (RL) training pipeline for multi-turn tool-use large language models (LLMs), optimized for long-horizon, real-environment tasks. It targets researchers and developers working on complex agentic systems, offering a framework to improve LLM performance on tasks like software engineering and text-to-SQL.

How It Works

SkyRL-v0 is a fork of the VeRL framework and uses its asynchronous rollout capabilities for efficient training. Asynchronous rollouts suit long-horizon tasks because agents can interact with real environments over extended periods without blocking one another. The integration of SGLang's async rollout feature is key to performance: it parallelizes environment interaction and data collection, both of which RL training depends on.
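To make the idea of asynchronous rollouts concrete, here is a minimal sketch in plain Python `asyncio`. It is not SkyRL, VeRL, or SGLang code: `ToyEnv`, `toy_policy`, `rollout`, and `collect` are hypothetical names invented for illustration, and the sleeps stand in for slow tool calls and LLM generation. It only shows how several multi-turn trajectories can be collected concurrently.

```python
import asyncio


class ToyEnv:
    """Hypothetical stand-in for a real environment; not part of SkyRL."""

    def __init__(self, env_id, horizon):
        self.env_id = env_id
        self.horizon = horizon
        self.turn = 0

    async def step(self, action):
        # Simulate a slow tool call (e.g. running tests or a SQL query).
        await asyncio.sleep(0.01)
        self.turn += 1
        done = self.turn >= self.horizon
        reward = 1.0 if done else 0.0  # sparse reward on the final turn
        return f"obs-{self.env_id}-{self.turn}", reward, done


async def toy_policy(observation):
    # Placeholder for an LLM generation call (assumption: served remotely).
    await asyncio.sleep(0.01)
    return f"action-for-{observation}"


async def rollout(env):
    """Collect one multi-turn trajectory of (observation, action, reward)."""
    trajectory = []
    observation, done = f"obs-{env.env_id}-0", False
    while not done:
        action = await toy_policy(observation)
        next_observation, reward, done = await env.step(action)
        trajectory.append((observation, action, reward))
        observation = next_observation
    return trajectory


async def collect(num_envs, horizon):
    envs = [ToyEnv(i, horizon) for i in range(num_envs)]
    # gather() interleaves the rollouts, so waiting on one environment
    # does not block the others.
    return await asyncio.gather(*(rollout(env) for env in envs))


trajectories = asyncio.run(collect(num_envs=4, horizon=3))
```

Because every `await` yields control, the four rollouts overlap in time rather than running one after another, which is the property that matters when each environment step is slow.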

Quick Start & Requirements

Installation: Clone the repository with submodules: git clone --recurse-submodules https://github.com/NovaSky-AI/SkyRL
Prerequisites: Refer to INSTALL.md for detailed instructions.
Resources: Training examples indicate requirements for multiple high-end GPUs (e.g., 8xH100 or 8xH200).
Links: Getting Started, SkyRL-SQL Blog Post, SkyRL-v0 Blog Post

Highlighted Details

SkyRL-Agent-14B-v0 achieves 21.6% on SWE-Bench-Verified, an improvement of 3.6 percentage points over its base model.
SkyRL-SQL-7B outperforms GPT-4o and o4-mini on Spider benchmarks by up to 9.2% with multi-turn RL.
Training for SkyRL-Agent-14B-v0 took 20 hours on 8xH200 GPUs.
SkyRL-SQL-7B was trained on only 653 samples.
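The figures above imply a couple of derived numbers, worked out in the short sketch below. The variable names are illustrative, and the base-model score assumes the reported 3.6% gain is measured in percentage points rather than as a relative change.

```python
# Training cost for SkyRL-Agent-14B-v0, from the figures quoted above.
num_gpus = 8            # 8xH200
wall_clock_hours = 20   # reported training time
gpu_hours = num_gpus * wall_clock_hours

# Implied base-model score on SWE-Bench-Verified.
# Assumption: the 3.6% gain is in percentage points, not a relative change.
agent_score = 21.6
gain_points = 3.6
implied_base_score = round(agent_score - gain_points, 1)

print(f"GPU-hours: {gpu_hours}")
print(f"Implied base score: {implied_base_score}%")
```

Under these assumptions the run costs 160 H200 GPU-hours and the base model would score 18.0%.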

Maintenance & Community

The project is associated with Berkeley Sky Computing Lab.
Compute support is provided by Anyscale, Databricks, NVIDIA, Lambda Labs, and AMD.
Key contributors from Tsinghua University and OpenBMB/ModelBest are acknowledged for SGLang integration.
Community links include Website, X, and Discord.

Licensing & Compatibility

The repository's license is not explicitly stated in the README.
Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

The codebase is a fork of VeRL and is undergoing refactoring to align with the VeRL main branch. Specific hardware requirements (multiple high-end GPUs) may present an adoption barrier. The licensing status requires clarification for commercial applications.

Health Check

Last Commit: 1 day ago
Responsiveness: 1 day

Star History

89 stars in the last 30 days