Tool-Star by RUC-NLPIR

LLM multi-tool reasoning powered by reinforcement learning

Created 4 months ago
256 stars

Top 98.7% on SourcePulse

Project Summary

Tool-Star is a reinforcement learning framework enabling LLMs to autonomously invoke multiple external tools for complex reasoning. It targets researchers and developers seeking to enhance LLM capabilities in computational and knowledge-intensive tasks, offering improved efficiency and reliability in tool usage.

How It Works

The framework integrates six tool types and combines systematic data synthesis with dedicated training algorithms. It uses reinforcement learning, building on frameworks such as ReCall and VERL, to train LLMs to reason step by step and decide when to invoke a tool. The goal is autonomous, efficient, and reliable tool use; the related ARPO project is reported to accelerate training by approximately 4x.
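To make the idea of stepwise reasoning with tool invocation concrete, here is a minimal sketch of an inference-time loop. It is not Tool-Star's implementation: the tag format (`<tool name="...">`, `<result>`, `<answer>`), the two toy tools, and the scripted policy standing in for the LLM are all hypothetical names chosen for illustration.

```python
import re
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical tag format; the real framework may use a different one.
TOOL_CALL = re.compile(r'<tool name="(\w+)">(.*?)</tool>', re.S)
ANSWER = re.compile(r"<answer>(.*?)</answer>", re.S)


def add_tool(arg: str) -> str:
    """Toy tool: sum integers written as 'a+b+c'."""
    return str(sum(int(x) for x in arg.split("+")))


def lookup_tool(arg: str) -> str:
    """Toy tool: answer from a tiny in-memory fact table."""
    facts = {"capital of France": "Paris"}
    return facts.get(arg.strip(), "not found")


# Hypothetical registry with two tools (the framework integrates six).
TOOLS: Dict[str, Callable[[str], str]] = {"add": add_tool, "lookup": lookup_tool}


def run_episode(
    policy: Callable[[str], str], question: str, max_steps: int = 4
) -> Tuple[Optional[str], List[str]]:
    """Alternate between policy steps and tool results until an answer appears."""
    transcript = [f"<question>{question}</question>"]
    for _ in range(max_steps):
        step = policy("\n".join(transcript))
        transcript.append(step)
        answer = ANSWER.search(step)
        if answer:
            return answer.group(1).strip(), transcript
        call = TOOL_CALL.search(step)
        if call:
            name, arg = call.group(1), call.group(2)
            tool = TOOLS.get(name)
            result = tool(arg) if tool else f"unknown tool: {name}"
            # Feed the tool output back so the next step can use it.
            transcript.append(f"<result>{result}</result>")
    return None, transcript


def scripted_policy(context: str) -> str:
    """Stand-in for the LLM: call a tool once, then answer with its result."""
    if "<result>" not in context:
        return '<tool name="add">2+3</tool>'
    last_result = re.findall(r"<result>(.*?)</result>", context)[-1]
    return f"<answer>{last_result}</answer>"


answer, transcript = run_episode(scripted_policy, "What is 2+3?")
print(answer)
```

In an RL setting, a reward would typically be computed from the final answer and the transcript of each such episode; that part is omitted here.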

Quick Start & Requirements

  • Installation: Requires setting up the LLaMA Factory repository for SFT and a separate Tool_Star_RL environment for RL training.
  • Prerequisites: Python 3.9+ (3.10 recommended for RL). Key dependencies include PyTorch (v2.4.0 with CUDA 12.4 recommended), FlashAttention, and vLLM (versions <= 0.6.3 are noted as potentially incompatible). API keys for web search (e.g., Bing) are required. Evaluation requires vLLM and FlashRAG.
  • Setup: Involves cloning repositories, installing Python packages, configuring API keys, and downloading datasets/models.
  • Links: LLaMA Factory (https://github.com/hiyouga/LLaMA-Factory.git); paper (https://arxiv.org/abs/2505.16410)
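Because the setup is sensitive to dependency versions, a small pre-flight check can catch mismatches before training starts. The sketch below is an assumption-laden helper, not part of Tool-Star: the function names `parse_version` and `check_environment` are hypothetical, and the thresholds simply restate the prerequisites listed above.

```python
import sys
from importlib import metadata
from typing import List, Optional, Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn '0.6.3.post1' or '2.4.0+cu124' into a tuple of leading integers."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def installed_version(package: str) -> Optional[str]:
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def check_environment(
    python_version: Tuple[int, ...],
    vllm_version: Optional[str],
    torch_version: Optional[str],
) -> List[str]:
    """Compare versions against the prerequisites above and collect warnings."""
    warnings = []
    if python_version[:2] < (3, 9):
        warnings.append("Python 3.9+ is required (3.10 recommended for RL).")
    if vllm_version is None:
        warnings.append("vLLM is not installed.")
    elif parse_version(vllm_version) <= (0, 6, 3):
        warnings.append("vLLM <= 0.6.3 is noted as potentially incompatible.")
    if torch_version is None:
        warnings.append("PyTorch is not installed.")
    elif parse_version(torch_version)[:3] != (2, 4, 0):
        warnings.append("PyTorch 2.4.0 (with CUDA 12.4) is the recommended version.")
    return warnings


if __name__ == "__main__":
    problems = check_environment(
        tuple(sys.version_info[:3]),
        installed_version("vllm"),
        installed_version("torch"),
    )
    for line in problems or ["Environment matches the listed prerequisites."]:
        print(line)
```

The check only reads package metadata, so it runs without importing PyTorch or vLLM and does not need a GPU.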

Highlighted Details

  • Achieves strong reasoning performance on over 10 computational and knowledge-intensive tasks (e.g., AIME24, MATH500, WebWalker).
  • Supports multiple LLM checkpoints based on Qwen2.5 series (0.5B to 7B parameters).
  • Introduces ARPO, a new project that accelerates Tool-Star training by approximately 4x.
  • Integrates six distinct tool types for enhanced reasoning.

Maintenance & Community

The project is actively maintained with recent updates in July 2025, including the release of the ARPO training accelerator. It welcomes community contributions. Contact is available via email at dongguanting@ruc.edu.cn.

Licensing & Compatibility

Released under the MIT License, which is permissive for commercial use and integration into closed-source projects.

Limitations & Caveats

The framework is described as still under development, with room for improvement. Version incompatibilities exist for vLLM (versions <= 0.6.3 are noted as potentially incompatible) and PyTorch. Setting up the RL and evaluation environments requires careful attention to dependency versions and API key configuration.

Health Check

  • Last commit: 1 week ago
  • Responsiveness: Inactive
  • Pull requests (30d): 0
  • Issues (30d): 1
  • Star history: 22 stars in the last 30 days
