MathBlackBox by trotsky1997

Research paper for mathematical reasoning via LLMs

Created 1 year ago

1,032 stars

Top 36.3% on SourcePulse

View on GitHub

6 Experts Love This Project

Bryan Helmig

Cofounder of Zapier

Philipp Schmid

DevRel at Google DeepMind

Wing Lian

Founder of Axolotl AI

Edward Sun

Research Scientist at Meta Superintelligence Lab

and 2 more!

Project Summary

This repository provides an open-source reimplementation of OpenAI's O1 model, focusing on advanced mathematical reasoning for Olympiad-level problems. It targets researchers and developers interested in replicating or building upon state-of-the-art LLM capabilities in complex mathematical domains, offering a framework for self-refinement and tree-based search.

How It Works

The project utilizes a Monte Carlo Tree Search (MCTS) approach combined with self-refinement techniques, specifically the MCTSr algorithm and its enhancement, LLaMA-Berry. This method iteratively explores potential solution paths, refines intermediate steps, and uses an "early stopping" mechanism based on a check function to identify correct answers, aiming for efficient and accurate problem-solving.

Quick Start & Requirements

Install: pip install vllm datasets transformers openai
Prerequisites: Requires an OpenAI-compatible inference server (e.g., vLLM) and Huggingface libraries. Slurm environment is supported for distributed runs, with configuration in make_n_server.py.
Usage: Run scripts like run_with_earlystopping.py or run_olympics.py, specifying model and dataset names.
Resources: High computational resources are recommended for optimal performance; disabling early stopping can significantly increase runtime.
Docs: LLaMA-Berry Preprint, MCTSr Preprint

Highlighted Details

Implements LLaMA-Berry, an upgraded version of MCTSr for Olympiad-level math.
Supports multiple datasets including GSM8K, MATH, and AIME.
Features an "early stopping" mechanism to halt search upon finding the correct answer.
Claims to achieve GPT-4 level mathematical reasoning.

Maintenance & Community

The project is actively developing, with recent updates announcing new phases and preprints. It calls for contributors. Links to related projects and datasets are provided.

Licensing & Compatibility

The repository's licensing is not explicitly stated in the README. Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

The project is in a very early stage of exploration and is intended for personal experimentation only. Users are cautioned against deploying it to real-world products without thorough testing, and the algorithm's output should be carefully reviewed.

Health Check

Last Commit

1 year ago

Responsiveness

Inactive

Pull Requests (30d)

Issues (30d)

Star History

1 stars in the last 30 days