MetaMath by meta-math

Math question generation for LLM training and evaluation

created 1 year ago
438 stars

Top 69.2% on sourcepulse

Project Summary

MetaMath provides open-source models and datasets for improving Large Language Model (LLM) performance on mathematical reasoning tasks. It targets researchers and developers seeking to enhance LLMs' capabilities in solving math problems, offering significant performance gains on benchmarks like GSM8k and MATH.

How It Works

MetaMath employs a data augmentation strategy to generate high-quality mathematical questions, effectively bootstrapping the LLM's learning process. This approach, inspired by existing works like WizardMath and RFT, focuses on creating a diverse and challenging dataset to fine-tune base LLMs. The resulting models demonstrate superior performance compared to other open-source LLMs of similar scales.
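The bootstrapping idea can be sketched as prompt-based question augmentation: a seed problem is rewritten into new variants by an LLM, and the variants are added to the fine-tuning set. The template wording and the `generate` stub below are hypothetical illustrations, not the actual prompts or pipeline used by MetaMath.

```python
# Minimal sketch of question bootstrapping via prompt-based augmentation.
# REPHRASE_TEMPLATE and generate() are hypothetical stand-ins, not the
# actual MetaMath prompts or augmentation code.

REPHRASE_TEMPLATE = (
    "Rewrite the following math problem so that it asks the same question "
    "in different words:\n{question}"
)

def build_augmentation_prompt(question: str) -> str:
    """Fill the rephrasing template with a seed question."""
    return REPHRASE_TEMPLATE.format(question=question)

def generate(prompt: str) -> str:
    """Stand-in for an LLM call (e.g., to ChatGPT 3.5) that would return a
    rephrased question; here it just echoes the prompt for illustration."""
    return prompt

if __name__ == "__main__":
    seed = "Natalia sold 48 clips in April. How many clips did she sell?"
    print(generate(build_augmentation_prompt(seed)))
```

In the real pipeline the `generate` call would go to an LLM, and the returned variants (plus their solutions) would form the augmented fine-tuning dataset.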

Quick Start & Requirements

  • Install via pip install -r requirements.txt after cloning the repository.
  • Requires Python; specific versions of ray and pyarrow may be needed.
  • Dataset loading is supported via Hugging Face datasets library.
  • Training requires base models (e.g., Llama-2) and the MetaMathQA dataset.
  • Evaluation utilizes vllm for fast generation.
  • Official models are available on Hugging Face.
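A typical setup following the steps above might look like the following; the GitHub URL and Hugging Face dataset id are inferred from the project name, so check the repo page if they differ.

```shell
# Clone the repository and install its Python dependencies
# (URL assumed from the project name).
git clone https://github.com/meta-math/MetaMath.git
cd MetaMath
pip install -r requirements.txt

# Load the MetaMathQA dataset with the Hugging Face datasets library
# (dataset id assumed to be meta-math/MetaMathQA).
python -c "from datasets import load_dataset; print(load_dataset('meta-math/MetaMathQA'))"
```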

Highlighted Details

  • MetaMath-Mistral-7B achieves 77.7 pass@1 on GSM8k, outperforming SOTA open-source LLMs.
  • MetaMath-Llemma-7B reaches 30.0 pass@1 on MATH benchmarks, leading in its scale.
  • MetaMath-70B surpasses ChatGPT 3.5 on GSM8k, even though its training data was augmented using ChatGPT 3.5.
  • Comprehensive results and comparisons with numerous other LLMs are provided.
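For context on the numbers above, pass@1 on GSM8k/MATH is simply the fraction of problems whose first generated answer matches the reference. A minimal sketch, with the answer-extraction step simplified away (real evaluation harnesses parse the final "The answer is ..." line from the model output):

```python
# pass@1: fraction of problems solved by the model's first sampled answer.
# Answer extraction and normalization are omitted for brevity.

def pass_at_1(predictions: list[str], references: list[str]) -> float:
    """Fraction of problems where the first prediction equals the reference."""
    assert len(predictions) == len(references)
    correct = sum(p == r for p, r in zip(predictions, references))
    return correct / len(references)

if __name__ == "__main__":
    preds = ["18", "42", "7", "100"]
    refs = ["18", "41", "7", "100"]
    print(f"pass@1 = {pass_at_1(preds, refs):.2%}")  # 3 of 4 correct
```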

Maintenance & Community

  • The project is actively maintained, with releases of models and datasets.
  • Code is based on WizardMath and RFT.
  • Citation details for the associated paper are provided.

Licensing & Compatibility

  • MetaMath-7B and MetaMath-13B models are released under the Llama 2 license.
  • MetaMath-Mistral-7B and MetaMath-Llemma-7B models are released under the Apache License 2.0.
  • Users should consult the respective licenses for compatibility and usage restrictions.

Limitations & Caveats

The MetaMathQA augmentation data was generated with ChatGPT 3.5, so the dataset may inherit biases or errors from that source model. Hardware requirements for training, such as multi-GPU setups, are implied by the provided training script but not documented explicitly.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: 1 week
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 8 stars in the last 90 days
