ToRA  by microsoft

Tool-integrated reasoning agent for math problem solving (ICLR'24 paper)

created 1 year ago
1,083 stars

Top 35.7% on sourcepulse

GitHubView on GitHub
1 Expert Loves This Project
Project Summary

ToRA is a series of tool-integrated reasoning LLM agents designed to tackle complex mathematical problems by leveraging external computational and symbolic tools. It aims to combine the natural language understanding of LLMs with the precision of external tools, benefiting researchers and developers working on advanced AI reasoning capabilities.

How It Works

ToRA agents integrate natural language reasoning with tool execution, interleaving rationales with program-based tool use. This approach allows LLMs to offload complex calculations and symbolic manipulations to specialized tools, enhancing accuracy and efficiency. The training pipeline involves imitation learning and output space shaping to optimize tool interaction.

Quick Start & Requirements

  • Install: Clone the repository, create a Conda environment (conda create -n tora python=3.10), activate it, install PyTorch with CUDA 11.8 support (pip install torch==2.0.1 --index-url https://download.pytorch.org/whl/cu118), and install requirements (pip install -r requirements.txt).
  • Prerequisites: Python 3.10, Conda, CUDA 11.8 (or compatible), vLLM (0.1.4) for inference acceleration.
  • Setup Time: Estimated to be under 30 minutes for environment setup.
  • Links: Website, Paper, HuggingFace Models, GitHub

Highlighted Details

  • Achieves state-of-the-art results on mathematical reasoning benchmarks like GSM8k and MATH.
  • ToRA-Code-34B is the first open-source model to exceed 50% accuracy on the MATH dataset (pass@1).
  • Offers pre-trained models ranging from 7B to 70B parameters, including code-specific variants.
  • Provides open-source training scripts and example data for custom model development.

Maintenance & Community

  • Actively maintained by Microsoft.
  • Open to contributions with a Contributor License Agreement (CLA).
  • Follows the Microsoft Open Source Code of Conduct.

Licensing & Compatibility

  • License: MIT License.
  • Compatibility: Permissive MIT license allows for commercial use and integration into closed-source projects.

Limitations & Caveats

The ToRA-Corpus-16k dataset is currently under internal review for open-sourcing. While training scripts are available, the specific dataset for training is not yet publicly released.

Health Check
Last commit

1 year ago

Responsiveness

Inactive

Pull Requests (30d)
0
Issues (30d)
1
Star History
19 stars in the last 90 days

Explore Similar Projects

Feedback? Help us improve.