MathGLM by THUDM

PyTorch implementation for math problem-solving LLM research

created 1 year ago
322 stars

Top 85.5% on sourcepulse

Project Summary

This repository provides the official PyTorch implementation of MathGLM, a family of large language models designed to excel at mathematical tasks, including arithmetic operations and math word problems. The project challenges the notion that LLMs inherently struggle with calculation, demonstrating high accuracy on multi-digit arithmetic and competitive performance on math word problems. It targets researchers and developers applying LLMs to quantitative reasoning.

How It Works

MathGLM models are fine-tuned from existing GLM architectures (e.g., GLM-10B, ChatGLM) on specialized datasets of multi-step arithmetic operations and text-based math problems. This training regime equips the models with robust quantitative reasoning, enabling them to perform complex calculations and solve word problems without external tools; on the project's arithmetic benchmarks, MathGLM surpasses the accuracy of models such as GPT-4.

Quick Start & Requirements

  • Install: Clone the repository and create a conda environment using conda env create -f env.yml.
  • Prerequisites: PyTorch, deepspeed (v0.6.0 for arithmetic tasks, v0.9.5 for 6B MWP tasks), and SwissArmyTransformer.
  • Data: Download pre-training datasets for arithmetic tasks or use the provided reconstructed Ape210K dataset for word problems.
  • Inference: Run ./inference.sh within MathGLM_Arithmetic or MathGLM_MWP directories.
  • Links: ModelScope, Paper
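Putting the steps above together, a minimal setup might look like the following sketch. The repository URL and the conda environment name (`mathglm`) are assumptions not stated above; the actual environment name is defined inside `env.yml`, so verify both against the repository's README.

```shell
# Sketch of the Quick Start steps; clone URL and env name are assumed.
git clone https://github.com/THUDM/MathGLM.git
cd MathGLM

# Creates the environment described in env.yml
# (PyTorch, deepspeed, SwissArmyTransformer).
conda env create -f env.yml
conda activate mathglm   # assumed name; use the one declared in env.yml

# Arithmetic inference (README requires deepspeed v0.6.0 for this task;
# use v0.9.5 instead for the 6B MWP task in MathGLM_MWP).
cd MathGLM_Arithmetic
./inference.sh
```

Note that the two task directories pin different deepspeed versions, so separate environments per task may be the safer choice.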

Highlighted Details

  • MathGLM-2B achieves 93.03% accuracy on multi-digit arithmetic tasks.
  • MathGLM-10B demonstrates comparable performance to GPT-4 on a Chinese math word problem test set.
  • Offers various model sizes from 10M to 10B parameters for different task requirements.
  • Includes scripts for pre-training and continuing training.

Maintenance & Community

The project is associated with THUDM (the Knowledge Engineering Group and Data Mining team at Tsinghua University). Further community engagement details are not explicitly provided in the README.

Licensing & Compatibility

The repository does not explicitly state a license. The models are available via THU-Cloud and ModelScope. Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

The README notes specific deepspeed version requirements for different tasks, indicating potential dependency sensitivity. Performance on English math word problems is not detailed.

Health Check
Last commit

1 year ago

Responsiveness

1+ week

Pull Requests (30d)
0
Issues (30d)
0
Star History
3 stars in the last 90 days

