RL tuning framework for LLM agents, inspired by DeepSeek-R1
OpenManus-RL is an open-source initiative focused on advancing Reinforcement Learning (RL) tuning for Large Language Model (LLM) agents. It aims to improve agent reasoning, decision-making, and tool integration capabilities, targeting researchers and developers in the LLM agent space. The project offers a platform for live-streamed development, sharing progress, datasets, and tuned models for benchmarks like GAIA and AgentBench.
How It Works
The project employs an RL-based tuning framework, inspired by RAGEN's RICO, to enhance LLM agents. It explores novel algorithmic structures, diverse reasoning paradigms (e.g., ReAct, Outcome-based), and sophisticated reward strategies (e.g., format-based, outcome-based). The approach integrates various rollout strategies like Tree-of-Thoughts and Monte Carlo Tree Search, and post-training methods including SFT, GRPO, PPO, and DPO, aiming for robust agent performance across multiple benchmarks.
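To make the reward side concrete, the sketch below shows one way a format-based and an outcome-based signal could be combined into a single scalar reward per rollout. The function names, regex pattern, and weights are illustrative assumptions, not OpenManus-RL's actual implementation.

```python
# Illustrative sketch only: OpenManus-RL's real reward code may differ.
# Combines the two signals the README mentions: a format reward (well-formed
# ReAct-style trace) and an outcome reward (did the episode succeed?).
import re

def format_reward(response: str) -> float:
    """Reward well-formed ReAct-style output: a Thought followed by an Action."""
    pattern = r"Thought:.*?Action:.*"
    return 1.0 if re.search(pattern, response, flags=re.DOTALL) else 0.0

def outcome_reward(task_success: bool) -> float:
    """Sparse terminal reward from the environment (e.g., WebShop task solved)."""
    return 1.0 if task_success else 0.0

def combined_reward(response: str, task_success: bool,
                    w_format: float = 0.2, w_outcome: float = 0.8) -> float:
    """Weighted mix of format and outcome signals; weights are illustrative."""
    return w_format * format_reward(response) + w_outcome * outcome_reward(task_success)

# Example: a valid ReAct trace whose episode nonetheless failed the task
print(combined_reward("Thought: search for a red mug\nAction: search[red mug]",
                      task_success=False))
```

In a GRPO- or PPO-style post-training loop, a scalar like this would be computed for each rollout and then converted into advantages for the policy update.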
Quick Start & Requirements
Install with `pip install -e .` after creating and activating a conda environment with Python 3.11. Key dependencies include torch (`torch==2.4.0`), vllm (`vllm==0.6.3`), flash-attn, and wandb. Specific environments such as WebShop require additional setup (`environment.yml`, `./setup.sh`).
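Before launching any tuning run, a short sanity check like the illustrative snippet below can confirm the pinned dependencies are installed; the version numbers are taken from the README and may change as the project evolves.

```python
# Illustrative environment check for the dependency pins listed above.
import importlib.metadata as md

expected = {"torch": "2.4.0", "vllm": "0.6.3"}
for pkg, want in expected.items():
    try:
        have = md.version(pkg)
    except md.PackageNotFoundError:
        print(f"{pkg}: NOT INSTALLED (expected {want})")
        continue
    status = "OK" if have == want else f"version mismatch (have {have}, want {want})"
    print(f"{pkg}: {status}")
```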
Highlighted Details
Maintenance & Community
Collaborative effort between Ulab-UIUC and MetaGPT. Community contributions are welcomed via issues and pull requests. Contact: kunlunz2@illinois.edu.
Licensing & Compatibility
The project is hosted on GitHub and cites a custom MIT-like license for its own code. However, it integrates with other projects and datasets, whose licenses should be independently verified for commercial use or closed-source linking.
Limitations & Caveats
The README states that the library for SFT and GRPO tuning is still under development. The "Code and dataset coming soon!" note from the initial description suggests potential incompleteness or ongoing development of core components.