RL training pipeline for multi-turn tool use LLMs, optimized for real-world tasks
SkyRL-v0 provides an open reinforcement learning (RL) training pipeline for multi-turn tool-use large language models (LLMs), optimized for long-horizon, real-environment tasks. It targets researchers and developers working on complex agentic systems, offering a framework to improve LLM performance on tasks like software engineering and text-to-SQL.
How It Works
SkyRL-v0 is a fork of the VeRL framework, leveraging its asynchronous rollout capabilities for efficient training. This approach is designed to handle the complexities of long-horizon tasks by enabling agents to interact with real environments over extended periods. The integration of SGLang's async rollout feature is key to its performance, allowing for parallelized interaction and data collection, which is crucial for RL training.
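The async-rollout pattern described above can be sketched in a few lines. This is an illustrative toy, not the real SkyRL or SGLang API: `ToyToolEnv`, `run_episode`, and `collect_rollouts` are hypothetical names, and the simulated latency stands in for the blocking tool/environment I/O that makes asynchronous interleaving worthwhile.

```python
import asyncio
import random

class ToyToolEnv:
    """Stand-in for a real tool-use environment (e.g. a SQL sandbox)."""
    def __init__(self, task_id):
        self.task_id = task_id
        self.turns = 0

    async def step(self, action):
        # Simulate variable tool latency; real environments block on I/O,
        # which is why interleaving many episodes concurrently pays off.
        await asyncio.sleep(random.uniform(0.001, 0.005))
        self.turns += 1
        done = self.turns >= 3          # fixed 3-turn episodes for the toy
        reward = 1.0 if done else 0.0   # terminal reward only
        return f"obs-{self.task_id}-{self.turns}", reward, done

async def run_episode(env, policy):
    """Roll out one multi-turn episode; returns a trajectory for RL training."""
    trajectory, obs, done = [], f"task-{env.task_id}", False
    while not done:
        action = policy(obs)
        obs, reward, done = await env.step(action)
        trajectory.append((action, obs, reward))
    return trajectory

async def collect_rollouts(num_envs, policy):
    """Run many episodes concurrently instead of one after another."""
    envs = [ToyToolEnv(i) for i in range(num_envs)]
    return await asyncio.gather(*(run_episode(e, policy) for e in envs))

if __name__ == "__main__":
    dummy_policy = lambda obs: f"tool_call({obs})"
    batch = asyncio.run(collect_rollouts(8, dummy_policy))
    print(len(batch), "trajectories collected")
```

Because the episodes await tool I/O rather than blocking on it, wall-clock time scales with the slowest episode instead of the sum of all episodes, which is the property that matters for long-horizon RL data collection.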
Quick Start & Requirements
git clone --recurse-submodules https://github.com/NovaSky-AI/SkyRL

See INSTALL.md for detailed instructions.

Highlighted Details
Maintenance & Community
Licensing & Compatibility
Limitations & Caveats
The codebase is a fork of VeRL and is undergoing refactoring to align with the VeRL main branch. Specific hardware requirements (multiple high-end GPUs) may present an adoption barrier. The licensing status requires clarification for commercial applications.