Code for replicating a math problem-solving solution
This repository provides the code and datasets for a winning solution to the AI Mathematical Olympiad - Progress Prize 1. It targets researchers and engineers in AI and mathematics, offering a robust framework for fine-tuning large language models on complex math problems using tool-integrated reasoning. The solution demonstrates a novel approach to generating high-quality training data and employs a self-consistency decoding algorithm for improved accuracy.
How It Works
The solution fine-tunes the DeepSeekMath-Base 7B model in two stages: first on Chain of Thought (CoT) data, then on Tool-Integrated Reasoning (TIR) data. The TIR data is generated with GPT-4 and interleaves reasoning with code execution feedback, strengthening the model's ability to solve problems that require symbolic manipulation and computation. At inference time, a self-consistency decoding algorithm (SC-TIR) samples multiple tool-integrated solution candidates and selects the final answer by majority vote, improving robustness.
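The SC-TIR idea can be sketched as follows. This is a minimal illustration, not the repository's implementation: sample several candidate solutions, execute the code each one produces, discard candidates whose code fails, and majority-vote over the executed answers. The `sample_solution` stub and its canned snippets are hypothetical stand-ins for the fine-tuned model.

```python
from collections import Counter

def sample_solution(problem: str, i: int) -> str:
    # Hypothetical stand-in for sampling the fine-tuned model: each
    # candidate is a code snippet that assigns its result to `answer`.
    demo = ["answer = 14 * 2", "answer = 28", "answer = 27", "answer = 1 / 0"]
    return demo[i % len(demo)]

def execute(snippet: str):
    # Tool-integration step: run the generated code and read back `answer`.
    scope = {}
    try:
        exec(snippet, scope)
        return scope.get("answer")
    except Exception:
        return None  # drop candidates whose code raises

def sc_tir(problem: str, n_samples: int = 4):
    # Self-consistency: execute every sampled candidate, filter out
    # failures, then majority-vote the surviving answers.
    answers = [execute(sample_solution(problem, i)) for i in range(n_samples)]
    answers = [a for a in answers if a is not None]
    return Counter(answers).most_common(1)[0][0] if answers else None
```

With the canned candidates above, the failing `1 / 0` snippet is discarded and the vote resolves to 28. In the real pipeline, sampling with nonzero temperature plays the role of the stub, and the executed code runs in a sandboxed Python interpreter rather than a bare `exec`.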
Quick Start & Requirements
pip install -r requirements.txt

Flash Attention 2 is also required. Log in to the Hugging Face CLI (huggingface-cli login) before running.
Maintenance & Community
The project is associated with Numina, and its contributors include prominent figures in LLM research. The README does not document a community roadmap; the repository is marked inactive, with its last update about a year ago.
Licensing & Compatibility
The repository is licensed under the MIT License, permitting commercial use and integration with closed-source projects.
Limitations & Caveats
The training process is resource-intensive, requiring a cluster of 8x H100 GPUs. Inference can run on less powerful hardware with quantization, but full performance requires high-end GPUs. The dataset generation pipeline also depends on GPT-4, which may introduce biases or limitations inherent to that model.