Math reasoning model for competition-level problems
Top 17.2% on sourcepulse
DeepSeekMath provides open-source 7B parameter language models specifically trained for advanced mathematical reasoning. Targeting researchers and developers, these models offer strong performance on benchmarks like MATH, approaching proprietary model capabilities without external toolkits, and also demonstrate robust tool-use and coding abilities.
How It Works
DeepSeekMath models are initialized from DeepSeek-Coder-v1.5 7B and further pre-trained on a 500B-token dataset comprising mathematical web text, natural language, and code. This extensive training, particularly on curated mathematical content from Common Crawl, gives the models strong mathematical reasoning capabilities. The instruct and RL variants are, respectively, fine-tuned for better instruction following and optimized with the Group Relative Policy Optimization (GRPO) algorithm.
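The core idea of GRPO is to replace a learned value network with advantages computed relative to a group of sampled responses for the same prompt. A minimal sketch of that group-relative advantage, based on the description in the DeepSeekMath paper (the function name is illustrative):

```python
# Sketch of GRPO's group-relative advantage: sample several responses
# per prompt, score each, and normalize rewards within the group
# (z-score) instead of estimating a baseline with a value network.
from statistics import mean, stdev

def group_relative_advantages(rewards: list[float], eps: float = 1e-8) -> list[float]:
    """Advantage of each sampled response = z-score of its reward in the group."""
    mu = mean(rewards)
    sigma = stdev(rewards) if len(rewards) > 1 else 0.0
    return [(r - mu) / (sigma + eps) for r in rewards]

# Example: four sampled answers to one problem, scored 1 if correct else 0.
# Correct answers get positive advantages, incorrect ones negative.
print(group_relative_advantages([1.0, 0.0, 0.0, 1.0]))
```

Because the baseline comes from the group itself, GRPO avoids training a separate critic, which is what makes it comparatively cheap for RL fine-tuning.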
Quick Start & Requirements
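A minimal inference sketch using Hugging Face transformers. The checkpoint name deepseek-ai/deepseek-math-7b-instruct, the prompt, and the generation settings are illustrative assumptions; running it requires a GPU with enough memory for a 7B model:

```python
# Illustrative quick-start for DeepSeekMath inference (assumes the
# deepseek-ai/deepseek-math-7b-instruct checkpoint on Hugging Face
# and an installed `transformers` + `torch`; needs a capable GPU).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "deepseek-ai/deepseek-math-7b-instruct"

def build_messages(question: str) -> list[dict]:
    """Wrap a math question in the chat format expected by the tokenizer."""
    return [{"role": "user", "content": question}]

def main() -> None:
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype=torch.bfloat16,  # recommended dtype for inference
        device_map="auto",
    )
    messages = build_messages(
        "What is the integral of x^2 from 0 to 1? Please reason step by step."
    )
    input_ids = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    outputs = model.generate(input_ids, max_new_tokens=256)
    # Decode only the newly generated tokens.
    print(tokenizer.decode(outputs[0][input_ids.shape[1]:], skip_special_tokens=True))

if __name__ == "__main__":
    main()
```

The `device_map="auto"` setting lets transformers place the weights across available devices automatically.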
torch_dtype=torch.bfloat16 is recommended for inference.
Highlighted Details
Maintenance & Community
Licensing & Compatibility
Limitations & Caveats