Dataset for advancing LLM mathematical reasoning
DeepMath provides DeepMath-103K, a large-scale, challenging, decontaminated, and verifiable mathematical dataset designed to advance reasoning in language models. It targets researchers and practitioners working on mathematical reasoning, particularly those training models with Reinforcement Learning (RL) or Supervised Fine-Tuning (SFT), offering a robust benchmark for evaluating and improving model capabilities.
How It Works
The core is the DeepMath-103K dataset, which contains difficult problems (Levels 5-9) across mathematical subjects such as Algebra, Calculus, and Number Theory. The problems are rigorously decontaminated against common benchmarks to minimize test set leakage, which keeps the data novel. The data format is rich: it includes verifiable final answers (crucial for RL reward functions), difficulty scores, hierarchical topic classifications, and multiple reasoning paths for SFT or distillation, so it supports varied research applications.
Quick Start & Requirements
Setup involves cloning the repository with its submodules (git clone --recurse-submodules), creating a Python 3.12.2 Conda environment, and installing numerous packages including PyTorch 2.5.1 with CUDA 12.4, flash-attn, vllm, and Ray. Significant GPU resources are implied. Key resources include the dataset on Hugging Face, model weights, the code repository, and the accompanying paper. Data preparation scripts and evaluation examples are available within the repository.
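Because the setup pins specific versions, a quick check of the active environment can save time before launching anything GPU-heavy. The sketch below is not part of the repository. It uses only the standard library, and the distribution names (`torch`, `flash-attn`, `vllm`, `ray`) are assumed to be the names under which those packages are installed. Only PyTorch has a version stated above, so the others are checked for presence only.

```python
import sys
from importlib import metadata

# Versions named in the setup summary; None means "any installed version".
EXPECTED = {"torch": "2.5.1", "flash-attn": None, "vllm": None, "ray": None}


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if missing."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def check_environment(expected=EXPECTED, python=(3, 12)):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    if tuple(sys.version_info[:2]) != tuple(python):
        found = ".".join(str(part) for part in sys.version_info[:3])
        problems.append(f"python: found {found}, expected {python[0]}.{python[1]}.x")
    for name, want in expected.items():
        have = installed_version(name)
        if have is None:
            problems.append(f"{name}: not installed")
        # Prefix match so local build tags such as "2.5.1+cu124" are accepted.
        elif want is not None and not have.startswith(want):
            problems.append(f"{name}: found {have}, expected {want}")
    return problems


if __name__ == "__main__":
    issues = check_environment()
    print("\n".join(issues) if issues else "environment looks consistent")
```

The check reports mismatches instead of raising, so it can run as a pre-flight step without aborting a larger script.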
Maintenance & Community
The project appears actively maintained, with recent news indicating updates to the dataset. However, the README lacks explicit community channel links or detailed contributor/sponsorship information.
Licensing & Compatibility
The README does not specify a software license. While hosted on GitHub and Hugging Face, suggesting open-source availability, users should verify terms for commercial use or integration into closed-source projects.
Limitations & Caveats
Recently, 48 samples with answer hints were identified and revised, highlighting potential data integrity issues that have since been addressed. The extensive, version-specific dependencies may complicate setup.
Dataset: https://huggingface.co/datasets/zwhe99/DeepMath-103K
Model weights: https://huggingface.co/collections/zwhe99/deepmath-6816e139b7f467f21a459a9a
Code: https://github.com/zwhe99/DeepMath
Paper: https://arxiv.org/abs/2504.11456