Awesome-AgenticLLM-RL-Papers by xhyumiracle

Surveying the landscape of agentic reinforcement learning for LLMs

Created 6 months ago

1,568 stars

Top 26.2% on SourcePulse

2 Experts Love This Project

Yiran Wu

Coauthor of AutoGen

Elie Bursztein

Cybersecurity Lead at Google DeepMind

Project Summary

This repository curates research papers on agentic reinforcement learning (RL) for large language models (LLMs) and serves as a survey of the field. It groups advances by agent task, including search, coding, and mathematics, giving researchers and practitioners a structured overview of RL techniques for LLM agents.

How It Works

As a survey, this repository organizes findings from the academic literature on LLM agents enhanced by RL. It describes the mechanisms and objectives of RL algorithm families (PPO, DPO, GRPO) and arranges the material by task domain (Search, Code, Math, GUI, etc.). It also lists relevant environments and frameworks, with links to the original research.

Quick Start & Requirements

This repository is a curated list of research resources, not a runnable project. Users should refer to the linked survey paper (https://arxiv.org/abs/2509.02547) and individual papers for setup and execution details of specific agentic LLM RL implementations.

Highlighted Details

Task Domains: Comprehensive coverage of agentic LLM RL across Search, Code, Math, GUI, Vision, Embodied, and Multi-Agent Systems.
Algorithm Families: Mechanisms and objectives of PPO, DPO, GRPO, and other RL algorithms.
Resource Catalog: Extensive lists of relevant environments, benchmarks, and RL frameworks with direct links to papers and code.

Maintenance & Community

The provided README content does not contain information on maintainers, contributors, community channels, sponsorships, or a project roadmap.

Licensing & Compatibility

The README specifies no license. Any licensing terms are those of the individual research papers and projects linked from the survey.

Limitations & Caveats

This survey represents a snapshot of agentic LLM RL research. Some sections are marked "TO BE ADDED," indicating incompleteness. The field's rapid evolution means new techniques emerge frequently, and this survey may not capture the latest advancements.

Health Check

Last Commit

1 month ago

Responsiveness

Inactive

Star History

91 stars in the last 30 days