MathGLM by THUDM

PyTorch implementation for math problem-solving LLM research

created 1 year ago
322 stars

Top 85.5% on sourcepulse

Project Summary

This repository provides the official PyTorch implementation of MathGLM, a family of large language models designed to excel at mathematical tasks, including arithmetic operations and math word problems. The project challenges the notion that LLMs inherently struggle with calculation, demonstrating high accuracy on multi-digit arithmetic and competitive performance on math word problems. It targets researchers and developers applying LLMs to quantitative reasoning.

How It Works

MathGLM models are fine-tuned from existing GLM architectures (e.g., GLM-10B, ChatGLM) on specialized datasets of multi-step arithmetic operations and text-based math problems. This training regime equips the models with robust quantitative reasoning, enabling them to perform complex calculations and solve word problems without external tools; on the project's arithmetic benchmarks, MathGLM surpasses the accuracy of models such as GPT-4.

Quick Start & Requirements

  • Install: Clone the repository and create a conda environment using conda env create -f env.yml.
  • Prerequisites: PyTorch, deepspeed (v0.6.0 for arithmetic tasks, v0.9.5 for 6B MWP tasks), and SwissArmyTransformer.
  • Data: Download pre-training datasets for arithmetic tasks or use the provided reconstructed Ape210K dataset for word problems.
  • Inference: Run ./inference.sh within MathGLM_Arithmetic or MathGLM_MWP directories.
  • Links: ModelScope, Paper
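Putting the steps above together, a minimal setup might look like the following sketch. The repository URL and the conda environment name (`mathglm`) are assumptions not stated above; the actual environment name is defined inside `env.yml`, so verify both against the repository's README.

```shell
# Sketch of the Quick Start steps; clone URL and env name are assumed.
git clone https://github.com/THUDM/MathGLM.git
cd MathGLM

# Creates the environment described in env.yml
# (PyTorch, deepspeed, SwissArmyTransformer).
conda env create -f env.yml
conda activate mathglm   # assumed name; use the one declared in env.yml

# Arithmetic inference (README requires deepspeed v0.6.0 for this task;
# use v0.9.5 instead for the 6B MWP task in MathGLM_MWP).
cd MathGLM_Arithmetic
./inference.sh
```

Note that the two task directories pin different deepspeed versions, so separate environments per task may be the safer choice.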

Highlighted Details

  • MathGLM-2B achieves 93.03% accuracy on multi-digit arithmetic tasks.
  • MathGLM-10B demonstrates comparable performance to GPT-4 on a Chinese math word problem test set.
  • Offers various model sizes from 10M to 10B parameters for different task requirements.
  • Includes scripts for pre-training and continuing training.

Maintenance & Community

The project is associated with THUDM (the Knowledge Engineering Group and Data Mining team at Tsinghua University). Further community engagement details are not explicitly provided in the README.

Licensing & Compatibility

The repository does not explicitly state a license. The models are available via THU-Cloud and ModelScope. Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

The README notes specific deepspeed version requirements for different tasks, indicating potential dependency sensitivity. Performance on English math word problems is not detailed.

Health Check
Last commit

1 year ago

Responsiveness

1+ week

Pull Requests (30d)
0
Issues (30d)
0
Star History
3 stars in the last 90 days

