InternLM-Math by InternLM

Math LLM for bilingual reasoning tasks

created 1 year ago
518 stars

Top 61.5% on sourcepulse

Project Summary

InternLM-Math provides state-of-the-art bilingual Large Language Models (LLMs) specifically designed for mathematical reasoning. It functions as a solver, prover, verifier, and augmentor, targeting researchers and developers working on advanced mathematical AI capabilities. The models offer significant improvements in both informal and formal mathematical reasoning tasks.

How It Works

InternLM-Math models are continued pre-trained on approximately 100 billion math-related tokens and then fine-tuned on a large bilingual math dataset. A key innovation is the integration of Lean 4, a theorem proving language, enabling the models to generate Lean code for reasoning tasks and act as a reward model for verifying reasoning steps. This approach allows for verifiable mathematical reasoning and auto-formalization.
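As an illustration of the Lean 4 workflow, a natural-language statement such as "the sum of two even numbers is even" could be auto-formalized along the following lines. This is a hand-written sketch (assuming Mathlib is available), not actual model output, and it has not been machine-checked here:

```lean
import Mathlib.Tactic

-- Hand-written sketch of an auto-formalization target, not model output:
-- "the sum of two even numbers is even."
theorem even_add_even (a b : ℕ)
    (ha : ∃ k, a = 2 * k) (hb : ∃ k, b = 2 * k) :
    ∃ k, a + b = 2 * k := by
  obtain ⟨m, hm⟩ := ha
  obtain ⟨n, hn⟩ := hb
  -- a + b = 2 * m + 2 * n = 2 * (m + n)
  exact ⟨m + n, by rw [hm, hn, Nat.mul_add]⟩
```

A verifier model can then check whether a generated proof like this one actually closes the goal, which is what makes the reasoning verifiable rather than merely plausible.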

Quick Start & Requirements

  • Inference: LMDeploy (>=0.2.1) is recommended; the Hugging Face transformers library is also supported.
  • Dependencies: PyTorch, Transformers.
  • Resources: Models range from 1.8B to 8x22B parameters, requiring appropriate GPU memory.
  • Links: Paper, Hugging Face Checkpoints, Demo.
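The transformers path above can be sketched as follows. The checkpoint name and the ChatML-style prompt template are assumptions based on the InternLM2 model family; check the model card on Hugging Face for the exact identifiers and format.

```python
# Minimal sketch of querying an InternLM2-Math chat model via Hugging Face
# transformers. The checkpoint name and the chat template below are
# assumptions; consult the official model card before relying on them.

def build_prompt(question: str) -> str:
    """Wrap a math question in an InternLM2-style chat template (assumed format)."""
    return (
        f"<|im_start|>user\n{question}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )

def solve(question: str, model_name: str = "internlm/internlm2-math-plus-7b") -> str:
    """Generate a solution. Requires a GPU with enough memory for the chosen model."""
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(
        model_name, trust_remote_code=True, device_map="auto"
    )
    inputs = tokenizer(build_prompt(question), return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=512)
    # Decode only the newly generated tokens, skipping the prompt.
    return tokenizer.decode(
        output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
    )
```

For production serving, the LMDeploy pipeline is the recommended route; the snippet above is only a quick local sanity check.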

Highlighted Details

  • Achieves state-of-the-art performance on benchmarks such as MiniF2F-test (43.4% for InternLM2-Math-Plus-7B), MATH (68.5% with Python for InternLM2-Math-Plus-Mixtral8x22B), and GSM8K (91.8% for InternLM2-Math-Plus-Mixtral8x22B).
  • Supports formal math reasoning via Lean 4 integration, including theorem proving and code generation.
  • Offers capabilities for math problem augmentation, code interpretation, and acting as a reward model for reasoning processes.
  • Models are available in various sizes (1.8B, 7B, 20B, 8x22B) and include base and chat-tuned versions.

Maintenance & Community

The project is actively developed by Shanghai AI Laboratory and associated researchers. Recent updates include new models like InternLM2.5-Step-Prover and datasets like Lean-Workbook.

Licensing & Compatibility

The project is released under the Apache 2.0 license, permitting commercial use and modification.

Limitations & Caveats

The models may skip calculation steps, perform poorly on certain Chinese fill-in-the-blank and English multiple-choice problems, and switch between languages mid-response due to the composition of the SFT data. Lean-related abilities may be limited to specific problem types.

Health Check

  • Last commit: 9 months ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 0
  • Issues (30d): 1
  • Star History: 12 stars in the last 90 days
