Collection of technical reports on slow thinking with LLMs
This repository provides a collection of technical reports and open-sourced models focused on enhancing Large Language Model (LLM) reasoning capabilities through "slow-thinking" techniques. It targets researchers and developers aiming to improve LLM performance on complex tasks like mathematical problem-solving and information retrieval, offering novel frameworks and reproducible results.
How It Works
The project explores various methods to elicit and improve "slow-thinking" or step-by-step reasoning in LLMs. Key approaches include reward-guided tree search, outcome-based reinforcement learning (RL) for search capabilities, knowledge distillation, self-distillation, and tool manipulation. These techniques aim to guide LLMs through more deliberate and structured reasoning processes, reducing hallucinations and improving accuracy on challenging benchmarks.
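To make the idea of reward-guided tree search concrete, here is a minimal toy sketch. It is not the repository's implementation: the "reasoning steps" are simple arithmetic moves, and the functions `expand` and `reward` are hypothetical stand-ins for an LLM proposing next steps and a reward model scoring partial solutions. The search always expands the partial solution with the highest reward first.

```python
import heapq

# Toy setting (an assumption for illustration only): a "reasoning state" is
# (current value, steps taken so far), and each step either adds 3 or doubles.

def expand(state):
    """Hypothetical stand-in for a policy model proposing next steps."""
    value, steps = state
    return [(value + 3, steps + ("+3",)), (value * 2, steps + ("*2",))]

def reward(state, target):
    """Hypothetical stand-in for a reward model: closer to target is better."""
    return -abs(target - state[0])

def reward_guided_search(start, target, max_expansions=50):
    """Best-first search that expands the highest-reward state first."""
    root = (start, ())
    counter = 0  # tie-breaker so the heap never compares states directly
    frontier = [(-reward(root, target), counter, root)]
    best = root
    for _ in range(max_expansions):
        if not frontier:
            break
        _, _, state = heapq.heappop(frontier)
        if reward(state, target) > reward(best, target):
            best = state
        if state[0] == target:
            return state  # a complete solution was found
        for child in expand(state):
            counter += 1
            heapq.heappush(frontier, (-reward(child, target), counter, child))
    return best  # budget exhausted: return the best partial solution seen

if __name__ == "__main__":
    print(reward_guided_search(1, 11))
```

In a real system the expansion budget, the reward signal, and the step granularity would all be design choices; the sketch only shows how a reward can steer which branch is explored next.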
Quick Start & Requirements
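This section quotes a few vLLM engine settings (tensor_parallel_size=8, gpu_memory_utilization=0.95, and max_model_len=int(1.5 * 20000)). As a rough sketch of how they might fit together, the snippet below collects them into keyword arguments for vLLM's `LLM` class. The model name is a hypothetical placeholder, the helper names `build_engine_kwargs` and `load_engine` are invented for illustration, and the exact arguments each released model expects may differ.

```python
def build_engine_kwargs(model_name, num_gpus=8, base_context=20000):
    """Collect the engine settings quoted in this section into one dict."""
    return {
        "model": model_name,
        # Shard the model across several GPUs (8 in the quoted example).
        "tensor_parallel_size": num_gpus,
        # Let vLLM use up to 95% of each GPU's memory.
        "gpu_memory_utilization": 0.95,
        # Context budget from the quoted example: int(1.5 * 20000) tokens.
        "max_model_len": int(1.5 * base_context),
    }

def load_engine(model_name):
    """Create a vLLM engine; needs vllm installed and enough GPUs."""
    from vllm import LLM  # imported lazily so the helper above works without vllm
    return LLM(**build_engine_kwargs(model_name))

if __name__ == "__main__":
    # "your-org/your-model" is a placeholder, not a model from the repository.
    print(build_engine_kwargs("your-org/your-model"))
```

On a machine with fewer GPUs, `num_gpus` would need to be lowered, and the context length may need to shrink so the model still fits in memory.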
Models are loaded and run with transformers and vllm.
Dependencies include transformers and vllm. Specific models may have varying hardware requirements (e.g., running vllm with tensor_parallel_size=8 suggests multi-GPU usage).
The vllm example specifies gpu_memory_utilization=0.95 and max_model_len=int(1.5 * 20000), indicating that substantial GPU memory is needed.
Highlighted Details
Maintenance & Community
The project is actively updated with recent reports and model releases (as of April 2025). Links to Hugging Face and Notion pages are provided for specific projects.
Licensing & Compatibility
The README does not explicitly state a single overarching license for the repository's content. Individual models and datasets may have different licenses, with some explicitly open-sourced for research purposes. Compatibility for commercial use is not specified.
Limitations & Caveats
The project acknowledges that its exploration is preliminary, with a capacity gap compared to industry-level systems. Future work focuses on scaling training approaches and extending capabilities to more complex tasks. Some models are released as previews or for research purposes only.