MathBlackBox  by trotsky1997

Research paper for mathematical reasoning via LLMs

created 1 year ago
1,028 stars

Top 37.1% on sourcepulse

GitHubView on GitHub
Project Summary

This repository provides an open-source reimplementation of OpenAI's O1 model, focusing on advanced mathematical reasoning for Olympiad-level problems. It targets researchers and developers interested in replicating or building upon state-of-the-art LLM capabilities in complex mathematical domains, offering a framework for self-refinement and tree-based search.

How It Works

The project utilizes a Monte Carlo Tree Search (MCTS) approach combined with self-refinement techniques, specifically the MCTSr algorithm and its enhancement, LLaMA-Berry. This method iteratively explores potential solution paths, refines intermediate steps, and uses an "early stopping" mechanism based on a check function to identify correct answers, aiming for efficient and accurate problem-solving.

Quick Start & Requirements

  • Install: pip install vllm datasets transformers openai
  • Prerequisites: Requires an OpenAI-compatible inference server (e.g., vLLM) and Huggingface libraries. Slurm environment is supported for distributed runs, with configuration in make_n_server.py.
  • Usage: Run scripts like run_with_earlystopping.py or run_olympics.py, specifying model and dataset names.
  • Resources: High computational resources are recommended for optimal performance; disabling early stopping can significantly increase runtime.
  • Docs: LLaMA-Berry Preprint, MCTSr Preprint

Highlighted Details

  • Implements LLaMA-Berry, an upgraded version of MCTSr for Olympiad-level math.
  • Supports multiple datasets including GSM8K, MATH, and AIME.
  • Features an "early stopping" mechanism to halt search upon finding the correct answer.
  • Claims to achieve GPT-4 level mathematical reasoning.

Maintenance & Community

The project is actively developing, with recent updates announcing new phases and preprints. It calls for contributors. Links to related projects and datasets are provided.

Licensing & Compatibility

The repository's licensing is not explicitly stated in the README. Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

The project is in a very early stage of exploration and is intended for personal experimentation only. Users are cautioned against deploying it to real-world products without thorough testing, and the algorithm's output should be carefully reviewed.

Health Check
Last commit

7 months ago

Responsiveness

Inactive

Pull Requests (30d)
0
Issues (30d)
0
Star History
13 stars in the last 90 days

Explore Similar Projects

Feedback? Help us improve.