Tool-Star by RUC-NLPIR

LLM multi-tool reasoning powered by reinforcement learning

Created 9 months ago

319 stars

Top 85.2% on SourcePulse

Project Summary

Tool-Star is a reinforcement learning framework enabling LLMs to autonomously invoke multiple external tools for complex reasoning. It targets researchers and developers seeking to enhance LLM capabilities in computational and knowledge-intensive tasks, offering improved efficiency and reliability in tool usage.

How It Works

The framework integrates six tool types and pairs them with systematic data synthesis and dedicated training algorithms. Its reinforcement learning stage builds on existing frameworks such as ReCall and VERL to train LLMs to reason step by step and to invoke tools along the way. The goal is autonomous, efficient, and reliable tool use; the related ARPO project is reported to accelerate training by approximately 4x.
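The stepwise reasoning and tool invocation described above can be pictured as a loop in which the model emits text, any tool call in that text is executed, and the result is fed back into the context. The following is a minimal sketch under stated assumptions, not Tool-Star's implementation: the `<tool name="...">` and `<result>` tags, the `run_episode` function, the scripted policy, and the `calc` tool are all hypothetical and chosen only for illustration.

```python
import re
from typing import Callable, Dict

# Hypothetical tag format, NOT taken from Tool-Star: <tool name="...">payload</tool>
TOOL_CALL = re.compile(r'<tool name="(\w+)">(.*?)</tool>', re.DOTALL)


def run_episode(
    generate: Callable[[str], str],
    tools: Dict[str, Callable[[str], str]],
    prompt: str,
    max_steps: int = 4,
) -> str:
    """Alternate between model generation and tool execution."""
    context = prompt
    for _ in range(max_steps):
        step = generate(context)
        context += step
        match = TOOL_CALL.search(step)
        if match is None:
            break  # no tool requested: treat this step as the final answer
        name, payload = match.group(1), match.group(2)
        tool = tools.get(name)
        result = tool(payload) if tool else f"unknown tool: {name}"
        # Feed the tool output back so the next step can reason over it.
        context += f"\n<result>{result}</result>\n"
    return context


def scripted_policy(context: str) -> str:
    """Stand-in for an LLM: call the calculator once, then answer."""
    if "<result>" not in context:
        return '<tool name="calc">2+3</tool>'
    return "The answer is 5."


def calc(expr: str) -> str:
    """Toy tool that adds two integers written as 'a+b'."""
    a, b = expr.split("+")
    return str(int(a) + int(b))


transcript = run_episode(scripted_policy, {"calc": calc}, "Q: 2+3?\n")
```

In an RL setting, a transcript like this would be scored by a reward function and used to update the policy; that part is omitted here because the source does not describe the reward design.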

Quick Start & Requirements

Installation: Requires setting up the LLaMA Factory repository for SFT and a separate Tool_Star_RL environment for RL training.
Prerequisites: Python 3.9+ (3.10 recommended for RL). Key dependencies include PyTorch (v2.4.0 with CUDA 12.4 recommended), FlashAttention, and a compatible vLLM release (vLLM <= 0.6.3 is noted as potentially incompatible). An API key for a web search provider (e.g., Bing) is required. Evaluation additionally requires vLLM and FlashRAG.
Setup: Involves cloning repositories, installing Python packages, configuring API keys, and downloading datasets/models.
Links: LLaMA Factory: https://github.com/hiyouga/LLaMA-Factory.git. Paper: https://arxiv.org/abs/2505.16410.
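The prerequisites above can be checked before starting a training run. The snippet below is a small sketch, not part of Tool-Star: `check_environment`, `version_tuple`, and `installed_version` are hypothetical helpers, and `BING_API_KEY` is an assumed environment variable name (the project may read the key from a config file or a differently named variable). The version thresholds are the ones stated in the prerequisites.

```python
import re
from importlib import metadata
from typing import List, Mapping, Optional, Tuple


def version_tuple(version: str) -> Tuple[int, ...]:
    """Turn '0.6.3.post1' or '2.4.0+cu124' into a tuple of leading integers."""
    parts = []
    for piece in version.split("+")[0].split("."):
        m = re.match(r"\d+", piece)
        if m is None:
            break  # stop at the first non-numeric component (e.g. 'post1')
        parts.append(int(m.group()))
    return tuple(parts)


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a package, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def check_environment(
    python_version: str,
    vllm_version: Optional[str],
    env: Mapping[str, str],
    search_key_var: str = "BING_API_KEY",  # assumed name, not from the source
) -> List[str]:
    """Collect human-readable problems; an empty list means no issue found."""
    problems = []
    if version_tuple(python_version) < (3, 9):
        problems.append("Python 3.9+ is required")
    if vllm_version is None:
        problems.append("vLLM is not installed")
    elif version_tuple(vllm_version) <= (0, 6, 3):
        # Conservative: post-releases of 0.6.3 are flagged as well.
        problems.append("vLLM <= 0.6.3 is noted as potentially incompatible")
    if not env.get(search_key_var):
        problems.append(f"{search_key_var} is not set")
    return problems
```

To run it against the live environment, one could call `check_environment(platform.python_version(), installed_version("vllm"), os.environ)` after importing `platform` and `os`.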

Highlighted Details

Achieves strong reasoning performance on over 10 computational and knowledge-intensive tasks (e.g., AIME24, MATH500, WebWalker).
Supports multiple LLM checkpoints based on Qwen2.5 series (0.5B to 7B parameters).
Introduces ARPO, a new project that accelerates Tool-Star training by approximately 4x.
Integrates six distinct tool types for enhanced reasoning.

Maintenance & Community

The project is actively maintained with recent updates in July 2025, including the release of the ARPO training accelerator. It welcomes community contributions. Contact is available via email at dongguanting@ruc.edu.cn.

Licensing & Compatibility

Released under the MIT License, which is permissive for commercial use and integration into closed-source projects.

Limitations & Caveats

The authors describe the framework as still under development, with room for improvement. Dependency versions matter: vLLM <= 0.6.3 is noted as potentially incompatible, and PyTorch v2.4.0 with CUDA 12.4 is the recommended pairing. Setting up the RL and evaluation environments therefore requires careful attention to dependency versions and API key configuration.
