AgentGym-RL by WooooDyy

Train LLM agents for long-horizon, multi-turn decision-making

Created 6 months ago

633 stars

Top 52.4% on SourcePulse

2 Experts Love This Project

Shizhe Diao

Author of LMFlow; Research Scientist at NVIDIA

Wing Lian

Founder of Axolotl AI

Project Summary

AgentGym-RL is a framework for training LLM agents with reinforcement learning on multi-turn, long-horizon decision-making tasks. It targets researchers and developers who want to improve LLM agent capabilities. The project reports that agents trained with it match or surpass commercial models across diverse real-world scenarios. [1, 2]

How It Works

The framework is split into three modules: Environment, Agent, and Training. The Environment module exposes diverse scenarios (WebArena, Deep Search, TextCraft, BabyAI, SciWorld) over HTTP. The Agent module handles reasoning. The Training module supports RL algorithms (PPO, GRPO, REINFORCE++) as well as non-RL methods (SFT, DPO). The key innovation, ScalingInter-RL, progressively increases the interaction horizon (the number of agent-environment turns allowed per episode) during training. This balances exploration and exploitation and is intended to keep optimization stable and efficient on long-horizon tasks.
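The idea of progressively scaling the interaction horizon can be sketched as a staged schedule that maps a training step to a cap on turns per episode. This is a minimal illustration, not the project's implementation: the class name HorizonSchedule, its fields, and all default values are hypothetical and chosen only to make the example concrete.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HorizonSchedule:
    """Hypothetical staged schedule for the per-episode turn limit.

    All defaults are illustrative assumptions, not values from AgentGym-RL.
    """

    initial_turns: int = 5      # assumed horizon at the start of training
    increment: int = 5          # assumed number of turns added per stage
    steps_per_stage: int = 100  # assumed training steps between increases
    max_turns: int = 20         # assumed upper bound on the horizon

    def max_turns_at(self, training_step: int) -> int:
        """Return the turn limit that applies at the given training step."""
        if training_step < 0:
            raise ValueError("training_step must be non-negative")
        stage = training_step // self.steps_per_stage
        return min(self.initial_turns + stage * self.increment, self.max_turns)


schedule = HorizonSchedule()
for step in (0, 99, 100, 250, 1000):
    # Short horizons early in training, longer ones later, capped at max_turns.
    print(f"step {step}: up to {schedule.max_turns_at(step)} turns")
```

A rollout loop would query max_turns_at once per training step and truncate episodes at that limit, so early updates see short interactions and later updates see progressively longer ones.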

Quick Start & Requirements

Install the core library with pip install -e . and pin the transformers version with pip install transformers==4.51.3. The recommended environment is CUDA 12.4, PyTorch 2.4, and Python 3.10. The README gives separate installation steps for flash-attn. Before training, users must also download the AgentGym-RL-Data-ID dataset from Hugging Face and launch a separate environment server. Links to the paper [1], repository [2], and dataset [3] are available.
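Because the recommended versions are specific, it can help to check an environment before starting a run. The sketch below compares the running Python version and the installed torch and transformers packages against the versions listed above, using only the standard library. The function names and the prefix-matching rule are assumptions made for this example; it does not check CUDA or flash-attn.

```python
import sys
from importlib import metadata

# Versions taken from the recommendations above.
RECOMMENDED_PYTHON = (3, 10)
RECOMMENDED_PACKAGES = {"torch": "2.4", "transformers": "4.51.3"}


def installed_version(package):
    """Return the installed version string of a package, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def check_environment(python_version=None, lookup=installed_version):
    """Return a list of warnings; an empty list means everything matches.

    python_version and lookup are injectable so the check can be tested
    without touching the real environment (an assumption of this sketch).
    """
    warnings = []
    major, minor = tuple((python_version or sys.version_info)[:2])
    if (major, minor) != RECOMMENDED_PYTHON:
        warnings.append(f"Python {major}.{minor} found; 3.10 is recommended")
    for package, wanted in RECOMMENDED_PACKAGES.items():
        found = lookup(package)
        if found is None:
            warnings.append(f"{package} is not installed; {wanted} is recommended")
        elif not (found == wanted or found.startswith(wanted + ".")):
            # Prefix match so that e.g. "2.4.0+cu124" satisfies "2.4".
            warnings.append(f"{package} {found} found; {wanted} is recommended")
    return warnings


if __name__ == "__main__":
    for warning in check_environment():
        print("warning:", warning)
```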

Highlighted Details

Qwen2.5-7B models trained with AgentGym-RL and ScalingInter-RL are reported to match or surpass top proprietary models (GPT-4o, Gemini-2.5-Pro) on 27 tasks. The framework supports diverse environments including WebArena, Deep Search, TextCraft, BabyAI, and SciWorld. It integrates multiple RL algorithms and provides a visual user interface for trajectory analysis.

Maintenance & Community

The README acknowledges underlying projects but offers no specific details on current maintenance status, active contributors, or community channels like Discord or Slack.

Licensing & Compatibility

The provided README does not specify the software license, leaving commercial use or closed-source integration compatibility unclear.

Limitations & Caveats

Setup is tied to specific versions: CUDA 12.4, PyTorch 2.4, Python 3.10, and transformers 4.51.3. Reliance on separately launched environment servers and the complexity of the training pipeline may present adoption challenges.

[1] https://arxiv.org/abs/2509.08755
[2] https://github.com/WooooDyy/AgentGym-RL
[3] https://huggingface.co/datasets/AgentGym-RL-Data-ID

Health Check

Last Commit

3 weeks ago

Responsiveness

Inactive

Star History

43 stars in the last 30 days