OpenManus-RL  by OpenManus

RL tuning framework for LLM agents, inspired by Deepseek-R1

created 4 months ago
3,265 stars

Top 15.2% on sourcepulse

GitHubView on GitHub
Project Summary

OpenManus-RL is an open-source initiative focused on advancing Reinforcement Learning (RL) tuning for Large Language Model (LLM) agents. It aims to improve agent reasoning, decision-making, and tool integration capabilities, targeting researchers and developers in the LLM agent space. The project offers a platform for live-streamed development, sharing progress, datasets, and tuned models for benchmarks like GAIA and AgentBench.

How It Works

The project employs an RL-based tuning framework, inspired by RAGEN's RICO, to enhance LLM agents. It explores novel algorithmic structures, diverse reasoning paradigms (e.g., ReAct, Outcome-based), and sophisticated reward strategies (e.g., format-based, outcome-based). The approach integrates various rollout strategies like Tree-of-Thoughts and Monte Carlo Tree Search, and post-training methods including SFT, GRPO, PPO, and DPO, aiming for robust agent performance across multiple benchmarks.

Quick Start & Requirements

  • Installation: pip install -e . (after creating and activating a conda environment with Python 3.11).
  • Prerequisites: PyTorch with CUDA 12.1 (torch==2.4.0), vllm (vllm==0.6.3), flash-attn, wandb. Specific environments like WebShop require additional setup (environment.yml, ./setup.sh).
  • Resources: Requires GPU with CUDA 12.1 support. WebShop setup involves launching a server on port 36001.
  • Links: Huggingface Dataset, AgentGym, Verl.

Highlighted Details

  • Open-sources an Agent SFT dataset on Huggingface.
  • Combines datasets from AgentInstruct, Agent-FLAN, and AgentTraj-L for over 50,000 trajectories.
  • Supports multiple reasoning models (GPT-O1, Deepseek-R1, QwQ-32B) and rollout strategies (ToT, GoT, DFSDT, MCTS).
  • Integrates with RL tuning frameworks like Verl, TinyZero, OpenR1, and Trlx.

Maintenance & Community

Collaborative effort between Ulab-UIUC and MetaGPT. Community contributions are welcomed via issues and pull requests. Contact: kunlunz2@illinois.edu.

Licensing & Compatibility

The project is hosted on GitHub and cites a custom MIT-like license for its own code. However, it integrates with other projects and datasets, whose licenses should be independently verified for commercial use or closed-source linking.

Limitations & Caveats

The README states that the library for SFT and GRPO tuning is still under development. The "Code and dataset coming soon!" note from the initial description suggests potential incompleteness or ongoing development of core components.

Health Check
Last commit

2 weeks ago

Responsiveness

1 day

Pull Requests (30d)
1
Issues (30d)
2
Star History
684 stars in the last 90 days

Explore Similar Projects

Starred by Chip Huyen Chip Huyen(Author of AI Engineering, Designing Machine Learning Systems), Jeff Hammerbacher Jeff Hammerbacher(Cofounder of Cloudera), and
10 more.

open-r1 by huggingface

0.2%
25k
SDK for reproducing DeepSeek-R1
created 6 months ago
updated 3 days ago
Feedback? Help us improve.