MathCoder by mathllm

LLM family for enhanced mathematical reasoning via code integration

created 1 year ago
302 stars

Top 89.3% on sourcepulse

View on GitHub
Project Summary

MathCoder is a family of LLMs and LMMs designed to enhance mathematical reasoning by integrating code generation and execution capabilities. It targets researchers and developers working on AI for mathematics, offering improved performance on complex math benchmarks.

How It Works

MathCoder models are fine-tuned using the MathCodeInstruct dataset, which interleaves natural language, code, and execution results. This approach allows the models to generate code-based solutions for mathematical problems, mirroring the functionality of tools like GPT-4's Code Interpreter. The models are trained to reason with code, execute it, and use the output for further reasoning, leading to enhanced problem-solving accuracy.
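
The loop below is a minimal illustrative sketch of this code-integrated reasoning, not the repository's actual inference code: it assumes a hypothetical `generate()` wrapper around the model and block delimiters in the spirit of the paper's `<|text|>` / `<|code|>` / `<|execution|>` format.

```python
# Illustrative sketch only; delimiter names and generate() are assumptions.
import contextlib
import io


def run_python(code: str) -> str:
    """Execute a generated code block and capture its stdout (unsandboxed; demo only)."""
    buf = io.StringIO()
    with contextlib.redirect_stdout(buf):
        exec(code, {})
    return buf.getvalue().strip()


def solve(problem: str, generate, max_blocks: int = 8) -> str:
    """Alternate between model generation and code execution until an answer appears."""
    transcript = problem
    for _ in range(max_blocks):
        # The model writes reasoning and may open a code block; stop at end of block.
        block = generate(transcript, stop=["<|endofblock|>"])
        transcript += block + "<|endofblock|>"
        if "<|code|>" in block:
            code = block.split("<|code|>", 1)[1]
            result = run_python(code)
            # Feed the execution output back so the model can reason over it.
            transcript += f"<|execution|>{result}<|endofblock|>"
        else:
            break  # no more code: final natural-language answer reached
    return transcript
```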

Quick Start & Requirements

  • Deployment: Models are served with Text Generation Inference (TGI).
  • Inference: Run the inference.py script against a TGI API endpoint (see the example after this list).
  • Evaluation: Uses the evaluate.py script.
  • Dependencies: Python and TGI. Hardware requirements (GPU, CUDA) are not explicitly listed, but a GPU is implied for serving models of this size.
  • Resources: Model weights are available on Hugging Face.
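
For illustration, a minimal way to query a TGI deployment from Python is sketched below; the endpoint URL, port, and prompt are placeholders, and generation parameters should be tuned for the released models.

```python
# Minimal sketch of a request to a TGI /generate endpoint (URL and prompt are placeholders).
import requests

TGI_URL = "http://localhost:8080/generate"  # assumed local TGI deployment

payload = {
    "inputs": "Question: What is the sum of the first 100 positive integers?\nSolution:",
    "parameters": {"max_new_tokens": 512},
}
response = requests.post(TGI_URL, json=payload, timeout=120)
response.raise_for_status()
print(response.json()["generated_text"])
```

The generated solutions can then be scored with the evaluate.py script mentioned above.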

Highlighted Details

  • The follow-up MathGenie work reaches 87.7% accuracy on GSM8K and 55.7% on MATH.
  • MathCoder outperforms ChatGPT-3.5 and PaLM-2 on GSM8K and MATH, and GPT-4 on the competition-level MATH dataset.
  • Models are based on Llama-2 and Code Llama architectures (7B, 13B, 34B variants).
  • Both the MathCoder and CSV (code-based self-verification) papers were accepted at ICLR 2024.

Maintenance & Community

  • Models and datasets are released on Hugging Face.
  • Paper available at arXiv:2310.03731.
  • Work featured by Aran Komatsuzaki.

Licensing & Compatibility

  • The README does not explicitly state the license for the models or code. It mentions releasing datasets and models, implying open availability but without a specific license.

Limitations & Caveats

The README does not specify any limitations or caveats regarding the models' performance, potential biases, or unsupported mathematical domains. The licensing status is also unclear, which may impact commercial use.

Health Check

  • Last commit: 2 months ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0

Star History

40 stars in the last 90 days

Explore Similar Projects

Starred by Chip Huyen (author of AI Engineering, Designing Machine Learning Systems), Woosuk Kwon (author of vLLM), and 11 more.

WizardLM by nlpxucan

LLMs built using Evol-Instruct for complex instruction following

created 2 years ago
updated 1 month ago
9k stars

Top 0.1% on sourcepulse