Qwen2.5-Math by QwenLM

Math LLM for solving math problems in Chinese and English

created 1 year ago
977 stars

Top 38.6% on sourcepulse

View on GitHub
Project Summary

Qwen2.5-Math is a series of large language models specifically designed for solving mathematical problems in both English and Chinese. Targeting researchers and developers working on mathematical AI, it offers significant performance improvements over its predecessor, Qwen2-Math, by supporting both Chain-of-Thought (CoT) and Tool-Integrated Reasoning (TIR) methods.

How It Works

Qwen2.5-Math models leverage advanced reasoning techniques, including CoT for step-by-step problem-solving and TIR for integrating external tools like code interpreters. This dual approach allows for more robust and accurate solutions to complex mathematical tasks, outperforming previous models on various benchmarks. The models are available in base and instruction-tuned variants, with a dedicated reward model (RM) for enhanced performance.
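In practice, the CoT and TIR modes are selected through the system prompt. A minimal sketch of building chat messages for either mode; the prompt wording below follows the pattern shown in the project README, but treat the exact strings as illustrative:

```python
# Sketch: building chat messages for Qwen2.5-Math in CoT vs. TIR mode.
# The system prompts follow the pattern shown in the project README;
# treat the exact wording as illustrative rather than canonical.

COT_SYSTEM = (
    "Please reason step by step, and put your final answer within \\boxed{}."
)
TIR_SYSTEM = (
    "Please integrate natural language reasoning with programs to solve "
    "the problem above, and put your final answer within \\boxed{}."
)

def build_messages(question: str, mode: str = "cot") -> list[dict]:
    """Return chat messages selecting CoT or TIR reasoning via the system prompt."""
    if mode not in ("cot", "tir"):
        raise ValueError(f"unknown mode: {mode!r}")
    system = COT_SYSTEM if mode == "cot" else TIR_SYSTEM
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": question},
    ]
```

In TIR mode the model emits code blocks meant to be executed by an external interpreter, so the surrounding harness must run the code and feed results back; CoT mode needs only plain text generation.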

Quick Start & Requirements

  • Installation: Use Hugging Face transformers library (version >= 4.37.0).
  • Dependencies: transformers, torch, vllm (for evaluation). Specific versions are crucial for reproducing results.
  • Resources: GPU memory requirements are comparable to Qwen2. See speed benchmark.
  • Documentation: Qwen Chat and the official Qwen documentation.
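The steps above can be sketched as a single generation call with transformers (>= 4.37.0). This assumes the Qwen/Qwen2.5-Math-7B-Instruct checkpoint on Hugging Face; running it requires a GPU and downloads the weights on first use:

```python
# Sketch: generating a CoT solution with transformers >= 4.37.0.
# Assumes the Qwen/Qwen2.5-Math-7B-Instruct checkpoint; requires a GPU
# and downloads the model weights on first run.

def solve(question: str, model_name: str = "Qwen/Qwen2.5-Math-7B-Instruct") -> str:
    # Heavy imports kept local so they are only paid for when the model is used.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(
        model_name, torch_dtype="auto", device_map="auto"
    )
    messages = [
        {"role": "system",
         "content": "Please reason step by step, and put your final answer "
                    "within \\boxed{}."},
        {"role": "user", "content": question},
    ]
    prompt = tokenizer.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=True
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=512)
    # Strip the prompt tokens and decode only the newly generated completion.
    completion = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(completion, skip_special_tokens=True)
```

For batch evaluation the README recommends vllm instead of plain transformers; the prompt construction stays the same.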

Highlighted Details

  • Supports both English and Chinese mathematical problem-solving.
  • Achieves state-of-the-art performance on benchmarks like GSM8K, MATH, and GaoKao Math QA.
  • Qwen2.5-Math-72B-Instruct demonstrates strong capabilities on challenging exams like AIME 2024 and AMC 2023.
  • Offers a dedicated Qwen2.5-Math-RM-72B reward model for further performance tuning.
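A reward model is typically applied via best-of-n sampling: draw several candidate solutions, score each with the RM, and keep the highest-scoring one. A minimal, model-agnostic sketch; the `score` callable standing in for an actual Qwen2.5-Math-RM-72B forward pass is hypothetical:

```python
# Sketch: best-of-n selection with a reward model.
# `score(question, answer)` is a hypothetical stand-in for a real RM
# forward pass (e.g. Qwen2.5-Math-RM-72B); any callable works here.
from typing import Callable

def best_of_n(question: str,
              candidates: list[str],
              score: Callable[[str, str], float]) -> str:
    """Return the candidate solution the reward model scores highest."""
    if not candidates:
        raise ValueError("need at least one candidate solution")
    return max(candidates, key=lambda ans: score(question, ans))
```

The same scoring loop also supports rejection sampling during fine-tuning, where low-scoring candidates are discarded rather than merely unranked.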

Maintenance & Community

  • Developed by the Qwen team.
  • Community support available via Discord and WeChat.

Licensing & Compatibility

  • The license is not stated in the README. Qwen releases have used both permissive and custom licenses depending on model size, so check the individual model cards on Hugging Face before commercial use or closed-source linking.

Limitations & Caveats

  • Primarily designed for mathematical tasks; use on general-purpose tasks is not recommended.
  • Reproducing evaluation results requires strict adherence to specified dependency versions.
Health Check

  • Last commit: 6 months ago
  • Responsiveness: 1+ week
  • Pull Requests (30d): 0
  • Issues (30d): 3
  • Star History: 69 stars in the last 90 days

Starred by George Hotz (author of tinygrad; founder of the tiny corp, comma.ai), Andrej Karpathy (founder of Eureka Labs; formerly at Tesla, OpenAI; author of CS 231n), and 5 more.
