Agent-R1 by 0russwest0

RL framework for training LLM agents via end-to-end reinforcement learning

created 5 months ago
707 stars

Top 49.4% on sourcepulse

Project Summary

Agent-R1 is an open-source framework for training large language model (LLM) agents using end-to-end reinforcement learning. It targets researchers and developers aiming to build autonomous agents by simplifying the process of defining domain-specific tools and reward functions, eliminating the need for complex workflow engineering. The framework enables agents to learn from complete interaction trajectories, coordinate multiple tools, and process both text and visual inputs.

How It Works

Agent-R1 employs end-to-end reinforcement learning to train agents on entire interaction sequences, allowing for learning from multi-turn tool usage and coordination. It supports custom tool integration via a base class and offers multiple RL algorithms like PPO, GRPO, and REINFORCE++. The framework also incorporates process rewards for individual tool calls, normalized against outcome rewards, and provides multi-modal support through integration with vision-language models.
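The custom-tool idea can be illustrated with a minimal sketch. Everything named here (BaseTool, CalculatorTool, execute, dispatch, and the shape of the tool-call dictionary) is hypothetical and chosen for illustration only; Agent-R1's actual base class and method names may differ, so consult the Extending Doc before relying on any of it.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict


class BaseTool(ABC):
    """Hypothetical tool interface; NOT Agent-R1's actual base class."""

    name: str = ""
    description: str = ""

    @abstractmethod
    def execute(self, args: Dict[str, Any]) -> str:
        """Run the tool and return an observation string for the agent."""


class CalculatorTool(BaseTool):
    """Toy domain-specific tool: adds two numbers."""

    name = "calculator"
    description = "Adds two numbers given as 'a' and 'b'."

    def execute(self, args: Dict[str, Any]) -> str:
        return str(float(args["a"]) + float(args["b"]))


def dispatch(tools: Dict[str, BaseTool], call: Dict[str, Any]) -> str:
    """Route one parsed tool call to the matching tool (assumed call format)."""
    tool = tools.get(call["name"])
    if tool is None:
        return f"error: unknown tool {call['name']!r}"
    return tool.execute(call.get("arguments", {}))


# Register tools by name, then execute a tool call emitted by the agent.
tools = {t.name: t for t in [CalculatorTool()]}
observation = dispatch(tools, {"name": "calculator", "arguments": {"a": 2, "b": 3}})
```

In a multi-turn rollout, each observation would be appended to the agent's context before the next generation step, which is what lets training cover the whole interaction sequence.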

Quick Start & Requirements

  • Install: run git submodule update --init --recursive, then reinstall verl locally.
  • Prerequisites: Python. Multi-modal training additionally relies on a vision-language model (VLM).
  • Resources: Hardware requirements are not specified, but training likely requires significant GPU resources.
  • Links: Algorithm Doc, Awesome-Agent-RL, Extending Doc

Highlighted Details

  • End-to-end reinforcement learning on complete interaction trajectories.
  • Multi-tool coordination and process rewards for tool call effectiveness.
  • Multi-modal support with vision-language models (VLMs).
  • Benchmarks show exact-match (EM) scores improving as the number of tool calls increases, suggesting a scaling relationship with interaction frequency.
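How process and outcome rewards might feed a group-based algorithm such as GRPO can be sketched as follows. The summary does not say how process rewards are normalized against outcome rewards, so the weighting in trajectory_reward (a scaled mean with a hypothetical weight parameter) is an assumption made for illustration. The advantage computation follows the commonly described GRPO recipe of standardizing rewards within a group of rollouts for the same prompt; it is not taken from Agent-R1's code.

```python
from statistics import mean, pstdev
from typing import List


def trajectory_reward(outcome: float, process: List[float], weight: float = 0.2) -> float:
    """Combine an outcome reward with per-tool-call process rewards.

    Assumption: process rewards are averaged and scaled by `weight` so they
    cannot dominate the outcome reward. The real scheme may differ.
    """
    if not process:
        return outcome
    return outcome + weight * mean(process)


def group_relative_advantages(rewards: List[float], eps: float = 1e-6) -> List[float]:
    """GRPO-style advantages: standardize rewards within one rollout group."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Three rollouts for one prompt: (outcome reward, process rewards per tool call).
rollouts = [
    (1.0, [1.0, 1.0]),  # correct answer, both tool calls useful
    (0.0, [1.0, 0.0]),  # wrong answer, one useful tool call
    (0.0, []),          # wrong answer, no tool calls
]
rewards = [trajectory_reward(o, p) for o, p in rollouts]
advantages = group_relative_advantages(rewards)
```

Under these assumptions, a rollout with a wrong answer but one useful tool call still ranks above a rollout that made no tool calls, which is the kind of signal process rewards are meant to provide.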

Maintenance & Community

  • Active development with recent updates in March and April 2025.
  • Community engagement via WeChat group.
  • Contributors include student researchers from USTC.

Licensing & Compatibility

  • The repository does not explicitly state a license in the README. This requires further investigation for commercial use or closed-source linking.

Limitations & Caveats

  • The README does not specify a license, which is a significant blocker for adoption without clarification.
  • Although multi-modal support has been added, the README does not detail how broadly or easily different VLMs can be integrated.
  • The codebase was recently refactored, requiring users to reinitialize submodules and reinstall.

Health Check

  • Last commit: 1 week ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 1
  • Issues (30d): 3
  • Star History: 275 stars in the last 90 days
