RL post-training framework for agentic learning
RL-Factory is an open-source framework for efficient Reinforcement Learning (RL) post-training of agent models, particularly in tool-use scenarios. It targets researchers and developers who want to simplify training agents that interact with tools, claiming a 2x training speedup and emphasizing ease of use through decoupled environment and reward-function definitions.
How It Works
RL-Factory decouples the environment from the RL training process, allowing users to define tool configurations and reward functions with minimal code. It supports asynchronous, parallel tool-calling for enhanced training efficiency and features multi-turn tool-calling, model-based rewards, and native support for training with Qwen3 models and MCP tools. This architecture aims to let users focus on reward logic and tool setup while enabling hardcore developers to optimize training efficiency and model performance.
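To make the decoupled design concrete, below is a minimal sketch of what a user-defined environment could look like: a tool configuration, a rule-based reward function, and asynchronous, parallel tool dispatch. All names here (ToolUseEnv, compute_reward, call_tools, config/mcp_tools.json) are illustrative assumptions, not RL-Factory's actual API.

```python
import asyncio


class ToolUseEnv:
    """User-defined environment: tool setup and reward logic live here,
    decoupled from the RL training loop."""

    def __init__(self, mcp_config: str):
        # Tools are declared in a config file (e.g. an MCP tool manifest)
        # instead of being hard-wired into the trainer.
        self.mcp_config = mcp_config

    def compute_reward(self, response: str, ground_truth: str) -> float:
        # Rule-based reward: full credit when the reference answer appears
        # in the response, plus a small bonus for a well-formed tool call.
        # A model-based judge could replace this function.
        reward = 1.0 if ground_truth in response else 0.0
        if "<tool_call>" in response and "</tool_call>" in response:
            reward += 0.1
        return reward

    async def _execute(self, call: dict) -> str:
        # Placeholder for dispatching one tool call (e.g. over MCP).
        await asyncio.sleep(0)  # stand-in for real network/tool I/O
        return f"result of {call.get('name', 'unknown')}"

    async def call_tools(self, calls: list[dict]) -> list[str]:
        # Asynchronous, parallel tool-calling: every pending call is
        # dispatched concurrently rather than blocking the rollout one
        # call at a time.
        return list(await asyncio.gather(*(self._execute(c) for c in calls)))


# The trainer consumes the environment; rollouts and policy updates stay
# inside the framework.
env = ToolUseEnv(mcp_config="config/mcp_tools.json")
results = asyncio.run(env.call_tools([{"name": "search"}, {"name": "calculator"}]))
```

The point of the split is that swapping tools or reshaping rewards touches only the environment definition; the RL training loop stays untouched.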
Quick Start & Requirements
Install with pip3 install -e . --no-deps (after installing the other dependencies). Required packages: accelerate, bitsandbytes, deepspeed, flash-attn, peft, ray, torch, transformers, vllm, qwen-agent, and llama_index. Optional packages: faiss-gpu-cu12 and nvidia-cublas-cu12.
Highlighted Details
Maintenance & Community
Built on verl, Search-R1, and Qwen-Agent.
Licensing & Compatibility
Limitations & Caveats
Currently, only Qwen models have been tested. The framework is under rapid development, with a WebUI and broader model support planned for future releases. Optional dependencies such as faiss-gpu-cu12 may be needed to resolve specific training issues.