Research paper repo for math reasoning in small LLMs via deep thinking
rStar-Math enables small language models (SLMs) to achieve state-of-the-art math reasoning capabilities, rivaling larger models without requiring distillation. It targets researchers and developers working on improving LLM reasoning, offering a framework for enhanced performance through self-evolved deep thinking.
How It Works
The core innovation is "deep thinking" via Monte Carlo Tree Search (MCTS). An SLM acts as a policy model, guiding a test-time search. This search is further refined by an SLM-based process reward model, which evaluates the quality of reasoning steps. This approach allows SLMs to explore multiple reasoning paths and self-correct, leading to more robust mathematical problem-solving.
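The search loop described above can be illustrated with a toy sketch. This is not the repository's implementation: `propose_steps` stands in for the SLM policy model, `score_step` stands in for the SLM-based process reward model, and the "math problem" is a hypothetical one (reach a target integer using the steps +1 and *2). All names and parameters here are assumptions made for illustration.

```python
import math


class Node:
    """One partial reasoning path in the search tree."""

    def __init__(self, state, parent=None, step=None):
        self.state = state      # value reached so far
        self.parent = parent
        self.step = step        # step that led here from the parent
        self.children = []
        self.visits = 0
        self.value = 0.0        # sum of rewards backed up through this node


def propose_steps(state):
    """Hypothetical stand-in for the policy model: candidate next steps."""
    return [("+1", state + 1), ("*2", state * 2)]


def score_step(state, target):
    """Hypothetical stand-in for the process reward model, in (0, 1]."""
    return 1.0 / (1.0 + abs(target - state))


def uct(child, parent_visits, c=1.4):
    """Upper confidence bound: unvisited children are tried first."""
    if child.visits == 0:
        return float("inf")
    exploit = child.value / child.visits
    explore = c * math.sqrt(math.log(parent_visits) / child.visits)
    return exploit + explore


def mcts(start, target, max_depth=6, rollouts=200):
    root = Node(start)
    for _ in range(rollouts):
        node, depth = root, 0
        # Selection: descend by UCT until reaching a leaf.
        while node.children and depth < max_depth:
            node = max(node.children, key=lambda ch: uct(ch, node.visits))
            depth += 1
        # Expansion: ask the "policy" for next steps unless the path is finished.
        if depth < max_depth and node.state != target:
            node.children = [Node(s, node, name) for name, s in propose_steps(node.state)]
            node = node.children[0]
        # Evaluation: score the step with the "reward model".
        reward = score_step(node.state, target)
        # Backpropagation: update statistics along the path to the root.
        while node is not None:
            node.visits += 1
            node.value += reward
            node = node.parent
    # Read out the most-visited path.
    path, node = [], root
    while node.children:
        node = max(node.children, key=lambda ch: ch.visits)
        path.append(node.step)
        if node.state == target:
            break
    return path, node.state


if __name__ == "__main__":
    steps, final_state = mcts(start=1, target=10)
    print(steps, final_state)
```

In this sketch the reward model scores every newly expanded step, so paths that drift away from the target accumulate low values and are visited less often, which is the self-correcting behavior the paragraph above describes. A real system would replace both stand-in functions with model calls.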
Quick Start & Requirements
Install dependencies with pip install -r requirements.txt. Flash-attention 2 is optional.
Highlighted Details
Maintenance & Community
Licensing & Compatibility
Limitations & Caveats