aimo-progress-prize  by project-numina

Code for replicating a math problem-solving solution

created 1 year ago
460 stars

Top 66.8% on sourcepulse

Project Summary

This repository provides the code and datasets for replicating the winning solution to the AI Mathematical Olympiad - Progress Prize 1. It is aimed at researchers and engineers working on AI for mathematics who want to fine-tune large language models on complex math problems using tool-integrated reasoning. The solution combines a GPT-4-based approach to generating training data with a self-consistency decoding algorithm that improves accuracy at inference time.

How It Works

The solution fine-tunes the DeepSeekMath-Base 7B model in two stages: first on Chain of Thought (CoT) data, then on Tool-Integrated Reasoning (TIR) data. The TIR data is generated with GPT-4 and includes code execution feedback, which strengthens the model on problems that require symbolic manipulation and computation. At inference time, a self-consistency decoding algorithm with TIR (SC-TIR) generates multiple solution candidates, which improves robustness.
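The self-consistency idea described above can be sketched in a few lines. This is a minimal illustration, not the repository's implementation: it assumes that candidates are aggregated by majority vote over their final answers, and `sample_answer` is a hypothetical stand-in for one full generation (reasoning, code execution, answer extraction).

```python
from collections import Counter
from typing import Callable, Optional


def self_consistency_vote(
    problem: str,
    sample_answer: Callable[[str], Optional[int]],
    num_samples: int = 8,
) -> Optional[int]:
    """Sample several candidate answers and return the most common one.

    `sample_answer` is a hypothetical callable that produces one candidate
    answer per call, or None when no parsable answer was produced.
    """
    answers = [sample_answer(problem) for _ in range(num_samples)]
    valid = [a for a in answers if a is not None]  # drop failed candidates
    if not valid:
        return None
    # most_common(1) returns the (answer, count) pair with the highest count.
    return Counter(valid).most_common(1)[0][0]


# Usage with a deterministic fake sampler (assumed values, for illustration).
fake_outputs = iter([42, None, 7, 42])
print(self_consistency_vote("demo", lambda _: next(fake_outputs), num_samples=4))
```

In this sketch, failed candidates are simply discarded before voting, so a single bad generation does not affect the result as long as most candidates agree.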

Quick Start & Requirements

  • Installation: Create a Conda environment, install PyTorch v2.1.2, run pip install -r requirements.txt, install Flash Attention 2, and log in with the Hugging Face CLI.
  • Prerequisites: Python 3.10, PyTorch 2.1.2, Flash Attention 2, Hugging Face Hub access.
  • Hardware: Training requires 8 x H100 GPUs (80GB VRAM each) and 1TB RAM. Inference can be performed on T4 GPUs with 8-bit quantization.
  • Resources: Training time is approximately 10 hours on the specified hardware.
  • Links: Website, Datasets, Demo
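As a rough back-of-the-envelope check on why 8-bit quantization makes T4 inference feasible, the sketch below estimates the memory taken by the weights alone. It assumes a nominal 7 billion parameters and ignores activations, the KV cache, and framework overhead; the 16 GB figure for a T4 comes from the GPU's specification, not from the README.

```python
def weight_memory_gib(num_params: float, bits_per_param: int) -> float:
    """Approximate memory needed for model weights alone, in GiB."""
    return num_params * bits_per_param / 8 / 2**30


params = 7e9  # assumption: nominal parameter count of a "7B" model
for bits in (16, 8):
    print(f"{bits}-bit weights: ~{weight_memory_gib(params, bits):.1f} GiB")
```

At 16 bits the weights alone come close to a T4's 16 GB, while at 8 bits they take roughly half of that, leaving headroom for generation.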

Highlighted Details

  • Fine-tuned DeepSeekMath-Base 7B model for mathematical reasoning.
  • Two high-quality datasets: NuminaMath-CoT (~860k problems) and NuminaMath-TIR (~70k problems).
  • Self-consistency decoding algorithm with code execution feedback (SC-TIR).
  • Utilizes TRL, PyTorch, vLLM, DeepSpeed, and AutoGPTQ.
  • Winning solution to the AI Mathematical Olympiad - Progress Prize 1.
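To make "code execution feedback" concrete, here is a minimal sketch of one tool-integrated step: extract the last Python block from a model's output, run it, and return what it printed so the text can be fed back to the model. The function name, the fence-based output format, and the timeout are assumptions for illustration; the repository's actual pipeline may differ, and running untrusted model-generated code needs proper sandboxing.

```python
import re
import subprocess
import sys
from typing import Optional

FENCE = "`" * 3  # three backticks, built programmatically for readability
CODE_BLOCK = re.compile(FENCE + r"python\n(.*?)" + FENCE, re.DOTALL)


def run_last_code_block(text: str, timeout_s: float = 5.0) -> Optional[str]:
    """Execute the last fenced Python block in `text` and return its output.

    Returns None when no block is found. On error, the interpreter's stderr
    is returned instead, so the model can see the traceback.
    """
    blocks = CODE_BLOCK.findall(text)
    if not blocks:
        return None
    try:
        result = subprocess.run(
            [sys.executable, "-c", blocks[-1]],
            capture_output=True,
            text=True,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        return "TimeoutExpired"
    output = result.stdout if result.returncode == 0 else result.stderr
    return output.strip()
```

In a full loop, the returned string would be appended to the model's context and generation would continue until a final answer appears.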

Maintenance & Community

The project is associated with Numina and its contributors include prominent figures in LLM research. Further details on community or roadmap are not explicitly provided in the README.

Licensing & Compatibility

The repository is licensed under the MIT License, permitting commercial use and integration with closed-source projects.

Limitations & Caveats

Training is resource-intensive, requiring 8 x H100 GPUs (80GB VRAM each). Inference can run on less powerful hardware, such as T4 GPUs, with 8-bit quantization, but the full performance benefits are realized on high-end GPUs. The dataset generation pipeline relies on GPT-4, so the training data may inherit that model's biases and limitations.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 35 stars in the last 90 days
