Agent-R1 by 0russwest0

RL framework for training LLM agents via end-to-end reinforcement learning

Created 10 months ago

1,149 stars

Top 33.5% on SourcePulse


2 Experts Love This Project

Elvis Saravia

Founder of DAIR.AI

Shizhe Diao

Author of LMFlow; Research Scientist at NVIDIA

Project Summary

Agent-R1 is an open-source framework for training large language model (LLM) agents using end-to-end reinforcement learning. It targets researchers and developers aiming to build autonomous agents by simplifying the process of defining domain-specific tools and reward functions, eliminating the need for complex workflow engineering. The framework enables agents to learn from complete interaction trajectories, coordinate multiple tools, and process both text and visual inputs.

How It Works

Agent-R1 trains agents with end-to-end reinforcement learning over entire interaction sequences, so the policy learns from multi-turn tool usage and coordination rather than from isolated steps. Custom tools are integrated by subclassing a base class, and several RL algorithms are supported, including PPO, GRPO, and REINFORCE++. The framework also assigns process rewards to individual tool calls, normalized against outcome rewards, and provides multi-modal support through integration with vision-language models.
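The two ideas above, a tool defined by subclassing a base class and a group-based RL update such as GRPO, can be sketched roughly as follows. This is a minimal illustration, not Agent-R1's actual API: BaseTool, CalculatorTool, run_tool_call, and group_relative_advantages are hypothetical names invented for this example. The advantage function shows the group-relative normalization commonly associated with GRPO (reward minus group mean, divided by group standard deviation); it uses the population standard deviation here, and real implementations may differ.

```python
from abc import ABC, abstractmethod
from statistics import mean, pstdev


class BaseTool(ABC):
    """Hypothetical tool interface (an assumption, not Agent-R1's real base class)."""

    name: str = ""

    @abstractmethod
    def execute(self, args: dict) -> str:
        """Run the tool and return an observation string for the agent."""


class CalculatorTool(BaseTool):
    """Toy domain-specific tool used only for illustration."""

    name = "calculator"

    def execute(self, args: dict) -> str:
        a, b = float(args["a"]), float(args["b"])
        results = {"add": a + b, "sub": a - b, "mul": a * b}
        return str(results[args["op"]])


def run_tool_call(tools: dict, call: dict) -> str:
    """Dispatch one parsed tool call; unknown tools yield an error observation."""
    tool = tools.get(call["name"])
    if tool is None:
        return f"error: unknown tool {call['name']!r}"
    return tool.execute(call["args"])


def group_relative_advantages(rewards: list, eps: float = 1e-8) -> list:
    """Normalize each trajectory reward against its sampling group (GRPO-style)."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; an assumption for this sketch
    return [(r - mu) / (sigma + eps) for r in rewards]
```

In a multi-turn rollout, each observation returned by run_tool_call would be appended to the trajectory before the next model turn, and the advantages would be computed once per group of sampled trajectories for the same prompt.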

Quick Start & Requirements

Install: Run git submodule update --init --recursive to initialize the submodules, then reinstall verl locally.
Prerequisites: Python. The framework is compatible with vision-language models (VLMs).
Resources: Hardware requirements are not specified, but training likely requires substantial GPU resources.
Links: Algorithm Doc, Awesome-Agent-RL, Extending Doc

Highlighted Details

End-to-end reinforcement learning on complete interaction trajectories.
Multi-tool coordination and process rewards for tool call effectiveness.
Multi-modal support with vision-language models (VLMs).
Benchmarks show exact match (EM) scores improving as the number of tool calls increases, which suggests a scaling trend tied to interaction frequency.
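For readers unfamiliar with the EM metric mentioned in the last item, the sketch below shows one common way to compute exact match and to group it by the number of tool calls in a trajectory. The normalization (lowercasing, stripping punctuation and English articles) follows the convention widely used in QA evaluation; whether Agent-R1 uses exactly this normalization is an assumption. The function names and the record layout are hypothetical.

```python
import re
import string

_PUNCTUATION = set(string.punctuation)


def normalize_answer(text: str) -> str:
    """Lowercase, drop punctuation and articles, and collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in _PUNCTUATION)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, references: list) -> float:
    """Return 1.0 if the prediction matches any reference after normalization."""
    pred = normalize_answer(prediction)
    return float(any(pred == normalize_answer(ref) for ref in references))


def mean_em_by_tool_calls(records) -> dict:
    """Average EM per tool-call count.

    records: iterable of (num_tool_calls, prediction, references) tuples;
    this layout is a hypothetical one chosen for the sketch.
    """
    buckets = {}
    for num_calls, prediction, references in records:
        buckets.setdefault(num_calls, []).append(exact_match(prediction, references))
    return {n: sum(scores) / len(scores) for n, scores in sorted(buckets.items())}
```

Plotting the output of mean_em_by_tool_calls against the tool-call count is one simple way to inspect the kind of trend the benchmark claim describes.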

Maintenance & Community

Active development with recent updates in March and April 2025.
Community engagement via WeChat group.
Contributors include student researchers from USTC.

Licensing & Compatibility

The repository does not explicitly state a license in the README. This requires further investigation for commercial use or closed-source linking.

Limitations & Caveats

The README does not specify a license, which blocks adoption until the terms are clarified.
Multi-modal support has been added, but the extent and ease of integration with specific VLMs are not documented.
The codebase was recently refactored, requiring users to reinitialize submodules and reinstall.

Health Check

Last Commit

1 month ago

Responsiveness

1 day

Pull Requests (30d)

Issues (30d)

Star History

121 stars in the last 30 days