ROLL by alibaba

RL library for large language models

created 2 months ago
1,591 stars

Top 26.8% on sourcepulse

Project Summary

ROLL is an open-source library designed to scale Reinforcement Learning (RL) for Large Language Models (LLMs) using distributed GPU resources. It targets AI labs, hyperscalers, and product developers aiming to enhance LLM capabilities in areas like human preference alignment, complex reasoning, and agentic interactions, offering significant speedups and cost reductions.

How It Works

ROLL employs a multi-role distributed architecture, leveraging Ray for flexible resource allocation and heterogeneous task scheduling. It integrates with high-performance backends such as Megatron-Core, SGLang, and vLLM to accelerate training and inference. The library emphasizes efficient data handling, including sample filtering based on difficulty and length, and provides advanced techniques for stabilizing training, such as value/advantage clipping and reward normalization.
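The data-handling ideas above (difficulty/length-based sample filtering and batch reward normalization) can be sketched in a few lines. This is an illustrative sketch only, not ROLL's API; the function names, sample schema, and thresholds are hypothetical.

```python
import statistics

def filter_samples(samples, max_len=512, min_acc=0.1, max_acc=0.9):
    """Drop prompts that are too long or whose difficulty (measured here
    as rollout accuracy) makes them trivially easy or near-impossible.
    Thresholds are hypothetical, not ROLL defaults."""
    return [
        s for s in samples
        if len(s["tokens"]) <= max_len and min_acc <= s["accuracy"] <= max_acc
    ]

def normalize_rewards(rewards, eps=1e-8):
    """Standardize rewards across a batch (zero mean, unit variance)
    to stabilize policy-gradient updates."""
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)
    return [(r - mean) / (std + eps) for r in rewards]
```

Filtering out samples the policy always (or never) solves concentrates gradient signal on informative prompts; normalizing rewards keeps update magnitudes comparable across batches.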

Quick Start & Requirements

  • Installation: pip install -e . (from source)
  • Prerequisites: Python 3.10+, PyTorch, Ray, Megatron-Core, SGLang, vLLM. Hardware requirements depend on model size and training scale; large-scale GPU clusters are expected for optimal use.
  • Resources: Setup time and resource footprint are highly variable based on the scale of LLM and GPU cluster.
  • Documentation: Quick Start, Installation, RLVR Pipeline, Agentic RL Pipeline.

Highlighted Details

  • Supports LLMs up to 200B+ parameters across thousands of GPUs with fault tolerance.
  • Offers flexible hardware usage with colocation/disaggregation and sync/async execution modes.
  • Features compositional sample-reward routing for dynamic task assignment, with support for custom reward and environment workers.
  • Includes advanced RL tuning techniques like dual clip loss, advantage whitening, and token-level KL regularization.
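The stabilization techniques listed above can be sketched as plain functions. This is a minimal illustration of dual-clip loss, advantage whitening, and a token-level KL penalty in general, not ROLL's implementation; all names and default constants here are hypothetical.

```python
def dual_clip_loss(ratio, advantage, eps=0.2, c=3.0):
    """Dual-clip PPO loss for one token. Applies the standard PPO clip,
    plus a second clip at c * advantage when the advantage is negative,
    bounding the loss on strongly off-policy samples."""
    clipped_ratio = max(min(ratio, 1 + eps), 1 - eps)
    surrogate = min(ratio * advantage, clipped_ratio * advantage)
    if advantage < 0:
        surrogate = max(surrogate, c * advantage)
    return -surrogate

def whiten(advantages, eps=1e-8):
    """Advantage whitening: rescale a batch to zero mean, unit variance."""
    mean = sum(advantages) / len(advantages)
    var = sum((a - mean) ** 2 for a in advantages) / len(advantages)
    return [(a - mean) / (var ** 0.5 + eps) for a in advantages]

def kl_penalty(logp, ref_logp, beta=0.05):
    """Token-level KL regularization: penalize divergence of the policy's
    per-token log-prob from a frozen reference model's."""
    return beta * (logp - ref_logp)
```

The dual clip prevents a single token with a large importance ratio and negative advantage from dominating the gradient, while whitening and the per-token KL term keep updates well-scaled and anchored to the reference policy.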

Maintenance & Community

Developed by Alibaba's TAOBAO & TMALL Group and Alibaba Group. The project posts updates regularly and has a tech report available. Community contributions are welcome.

Licensing & Compatibility

Licensed under the Apache License (Version 2.0). The project utilizes third-party components under other open-source licenses, as detailed in the NOTICE file. Compatible with commercial use.

Limitations & Caveats

The project is under active development, with upcoming features such as a Qwen2.5-VL RL pipeline and FSDP2 integration. While it supports single-GPU setups, its primary design focus is large-scale GPU clusters.

Health Check

  • Last commit: 20 hours ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 14
  • Issues (30d): 31
Star History: 1,613 stars in the last 90 days

Explore Similar Projects

Starred by Jeff Hammerbacher (Cofounder of Cloudera) and Stas Bekman (Author of Machine Learning Engineering Open Book; Research Engineer at Snowflake).

InternEvo by InternLM

1.0%
402
Lightweight training framework for model pre-training
created 1 year ago
updated 1 week ago
Starred by Chip Huyen (Author of AI Engineering, Designing Machine Learning Systems), Philipp Schmid (DevRel at Google DeepMind), and 2 more.

LightLLM by ModelTC

0.7%
3k
Python framework for LLM inference and serving
created 2 years ago
updated 15 hours ago
Starred by Lewis Tunstall (Researcher at Hugging Face), Robert Nishihara (Cofounder of Anyscale; Author of Ray), and 4 more.

verl by volcengine

2.4%
12k
RL training library for LLMs
created 9 months ago
updated 14 hours ago
Starred by George Hotz (Author of tinygrad; Founder of the tiny corp, comma.ai), Anton Bukov (Cofounder of 1inch Network), and 16 more.

tinygrad by tinygrad

0.1%
30k
Minimalist deep learning framework for education and exploration
created 4 years ago
updated 18 hours ago
Starred by Andrej Karpathy (Founder of Eureka Labs; Formerly at Tesla, OpenAI; Author of CS 231n), Tobi Lutke (Cofounder of Shopify), and 27 more.

vllm by vllm-project

1.0%
54k
LLM serving engine for high-throughput, memory-efficient inference
created 2 years ago
updated 14 hours ago