Tree-GRPO by AMAP-ML

LLM agent reinforcement learning with tree search

Created 5 months ago

289 stars

Top 91.2% on SourcePulse

Project Summary


Tree-GRPO introduces a tree-search rollout strategy for reinforcement learning (RL) of LLM agents, replacing independent chain-based rollouts with rollouts sampled over a search tree. By improving exploration and drawing supervision signals from the tree structure, it aims to match or exceed chain-based performance with a smaller rollout budget and shorter training cycles. It targets researchers and engineers who train LLM agents.

How It Works

Tree-GRPO constructs a search tree in which each node is one ReAct step, then samples rollouts over this semantically structured tree. In contrast to independent, chain-based rollouts, this allows more efficient exploration of the state-action space and provides a richer, tree-based supervision signal. The core advantage is comparable or superior performance with a fraction of the rollout budget.
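To make the idea of a tree-based supervision signal concrete, here is a minimal sketch in Python. It is not the repository's implementation: the `Node` class, the function names, and the choice to score each branch by the mean outcome reward of its leaves and normalize among siblings are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from statistics import mean, pstdev
from typing import Optional


@dataclass
class Node:
    """One ReAct step in the search tree (hypothetical structure)."""
    step: str                                  # unique label for this step
    children: list = field(default_factory=list)
    reward: Optional[float] = None             # outcome reward, leaves only


def leaf_rewards(node):
    """Collect the outcome rewards of every finished rollout below `node`."""
    if not node.children:
        return [node.reward]
    return [r for child in node.children for r in leaf_rewards(child)]


def sibling_advantages(node, eps=1e-6, out=None):
    """Assign each non-root step an advantage relative to its siblings.

    A branch's value is the mean reward of the leaves beneath it; the
    advantage is that value standardized within the sibling group
    (an illustrative choice, not taken from the source).
    """
    if out is None:
        out = {}
    if node.children:
        values = [mean(leaf_rewards(c)) for c in node.children]
        mu, sigma = mean(values), pstdev(values)
        for child, value in zip(node.children, values):
            out[child.step] = (value - mu) / (sigma + eps)
            sibling_advantages(child, eps, out)
    return out


# Three rollouts share the root; two of them also share step "a".
tree = Node("root", [
    Node("a", [Node("a1", reward=1.0), Node("a2", reward=0.0)]),
    Node("b", reward=0.0),
])
advantages = sibling_advantages(tree)
```

In this toy tree, step "a" receives a positive advantage over "b" because one of its continuations succeeds, and "a1" is preferred over "a2" within that branch. A chain-based setup would need three full rollouts from the root to get comparable outcomes, and would only compare them at the trajectory level.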

Quick Start & Requirements

Requires separate Conda environments: treegrpo (Python 3.12.9) and retriever (Python 3.10.13).
Key dependencies: PyTorch (2.6.0), vLLM (0.8.5.post1), Flash Attention 2, FAISS-GPU (1.7.3), Transformers, Datasets, Pyserini, FastAPI, Uvicorn.
Setup involves downloading and processing datasets, then launching retrieval servers or configuring Bing API integration.
Launch scripts are provided: train_multihopqa_grpo.sh and train_multihopqa_tree_search.sh.
Training logs are tracked via SwanLab.
Links: [arXiv](
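
Because the requirements pin exact versions, a quick check of an environment can save a failed launch. The following is a minimal sketch using only the Python standard library. The distribution names (`torch`, `vllm`, `faiss-gpu`) are assumptions about how these packages are published, and the helper names are hypothetical. Since the two Conda environments hold different packages, a "not installed" result in one of them may be expected.

```python
from importlib import metadata

# Version pins copied from the dependency list above.
# Distribution names are assumptions, not taken from the repository.
PINS = {
    "torch": "2.6.0",
    "vllm": "0.8.5.post1",
    "faiss-gpu": "1.7.3",
}


def installed_version(dist_name):
    """Return the installed version string, or None if not installed."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def check_pins(pins, lookup=installed_version):
    """Return {name: (expected, found)} for every pin that does not match."""
    problems = {}
    for name, expected in pins.items():
        found = lookup(name)
        # Ignore local build tags such as "+cu124" when comparing.
        if found is None or found.split("+")[0] != expected:
            problems[name] = (expected, found)
    return problems


if __name__ == "__main__":
    for name, (expected, found) in check_pins(PINS).items():
        print(f"{name}: expected {expected}, found {found or 'not installed'}")
```

Run the script inside each activated environment; it prints nothing when every pinned package it finds matches.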

Health Check

Last Commit: 1 month ago
Responsiveness: Inactive
Pull Requests (30d): 0
Issues (30d): 0

Star History

16 stars in the last 30 days

Explore Similar Projects

awesome-in-context-rl by dunnolab

Advancing reinforcement learning through in-context learning paradigms

Created 1 year ago

Updated 5 months ago

saplings by shobrook

Reasoning library for agentic tree search & tool use

Created 2 years ago

Updated 7 months ago

Starred by Wing Lian (Founder of Axolotl AI).

Open-AgentRL by Gen-Verse

Reinforcement learning for LLM agents

Created 4 months ago

Updated 3 weeks ago

Awesome-RL-based-LLM-Reasoning by bruno686

Resource list for RL-based LLM reasoning

Created 1 year ago

Updated 7 months ago

Starred by Yaowei Zheng (Author of LLaMA-Factory).

ARPO by RUC-NLPIR

Agentic RL for LLM tool use

Created 7 months ago

Updated 4 weeks ago

Agentic-RAG-R1 by jiangxinke

Agentic RAG framework enhanced with reinforcement learning

Created 11 months ago

Updated 1 week ago

Starred by Wing Lian (Founder of Axolotl AI), Vincent Weisser (Cofounder of Prime Intellect), and 3 more.

LlamaGym by KhoomeiK

SDK for fine-tuning LLM agents with online reinforcement learning

Created 2 years ago

Updated 1 year ago

Starred by Yiran Wu (Coauthor of AutoGen) and Elie Bursztein (Cybersecurity Lead at Google DeepMind).

Awesome-AgenticLLM-RL-Papers by xhyumiracle

Surveying the landscape of agentic reinforcement learning for LLMs

Created 6 months ago

Updated 1 month ago

Starred by Shizhe Diao (Author of LMFlow; Research Scientist at NVIDIA).

verl-tool by TIGER-AI-Lab

Framework for training tool-using LLM agents

Created 11 months ago

Updated 6 days ago

Starred by Yaowei Zheng (Author of LLaMA-Factory).

Awesome-LLM-Post-training by mbzuai-oryx

Curated list of LLM post-training resources

Created 1 year ago

Updated 4 months ago

Starred by Yineng Zhang (Inference Lead at SGLang; Research Scientist at Together AI).

RL-Factory by Simple-Efficient

RL post-training framework for agentic learning

Created 9 months ago

Updated 2 months ago

Starred by Pawel Garbacki (Cofounder of Fireworks AI), Dan Guido (Cofounder of Trail of Bits), and 4 more.

agent-lightning by microsoft

Train any AI agent with rollouts and feedback

Created 8 months ago

Updated 2 weeks ago
