Slow_Thinking_with_LLMs by RUCAIBox

Collection of technical reports on slow thinking with LLMs

Created 7 months ago · 713 stars · Top 49.1% on sourcepulse

Project Summary

This repository provides a collection of technical reports and open-sourced models focused on enhancing Large Language Model (LLM) reasoning capabilities through "slow-thinking" techniques. It targets researchers and developers aiming to improve LLM performance on complex tasks like mathematical problem-solving and information retrieval, offering novel frameworks and reproducible results.

How It Works

The project explores various methods to elicit and improve "slow-thinking" or step-by-step reasoning in LLMs. Key approaches include reward-guided tree search, outcome-based reinforcement learning (RL) for search capabilities, knowledge distillation, self-distillation, and tool manipulation. These techniques aim to guide LLMs through more deliberate and structured reasoning processes, reducing hallucinations and improving accuracy on challenging benchmarks.
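To make the first of these approaches concrete, below is a minimal sketch of reward-guided tree search over partial reasoning chains: a best-first search that repeatedly expands the highest-scoring chain. The `generate_steps` and `reward_model` callables, the beam width, and the depth limit are hypothetical stand-ins for an LLM sampler and a process reward model, not components of this repository.

```python
import heapq
from itertools import count

# Minimal sketch of reward-guided tree search over partial reasoning chains.
# `generate_steps` (samples candidate next steps from an LLM) and
# `reward_model` (scores a partial chain) are hypothetical stand-ins,
# not this repository's actual components.

def reward_guided_search(question, generate_steps, reward_model,
                         beam_width=4, max_depth=8):
    tie = count()  # tiebreaker so the heap never compares chains directly
    frontier = [(0.0, next(tie), [])]  # (negated reward, tiebreak, chain)
    best_chain, best_score = [], float("-inf")
    for _ in range(max_depth):
        if not frontier:
            break
        _, _, chain = heapq.heappop(frontier)  # most promising chain so far
        # Expand: sample several candidate next reasoning steps from the LLM.
        for step in generate_steps(question, chain, n=beam_width):
            new_chain = chain + [step]
            score = reward_model(question, new_chain)
            if score > best_score:
                best_chain, best_score = new_chain, score
            heapq.heappush(frontier, (-score, next(tie), new_chain))
    return best_chain
```

Scoring partial chains with a reward model, rather than only complete answers, is what lets the search prune weak reasoning paths early.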

Quick Start & Requirements

  • Install/Run: The README provides a Python code snippet using transformers and vllm for loading and running models (a hedged sketch of such a snippet follows this list).
  • Prerequisites: Requires Python, transformers, and vllm. Specific models may have varying hardware requirements (e.g., vllm with tensor_parallel_size=8 implies a multi-GPU node).
  • Resources: The vllm example specifies gpu_memory_utilization=0.95 and max_model_len=int(1.5 * 20000), indicating that substantial GPU memory is needed.
  • Links: Project pages and Hugging Face model repositories are linked for specific components.
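The exact snippet lives in the README; the sketch below shows the same transformers + vllm pattern. The model ID, prompt, and sampling parameters are placeholders, while the parallelism and memory settings mirror the values quoted above.

```python
from transformers import AutoTokenizer
from vllm import LLM, SamplingParams

# Placeholder: replace with the Hugging Face ID of the specific model you
# want to run; IDs are linked from the README for each project.
model_id = "<hf-model-id>"

# Settings below mirror those quoted in the README: tensor_parallel_size=8
# assumes an 8-GPU node, and max_model_len reserves room for long reasoning.
llm = LLM(
    model=model_id,
    tensor_parallel_size=8,
    gpu_memory_utilization=0.95,
    max_model_len=int(1.5 * 20000),
)

# Build a chat-formatted prompt with the model's own template.
tokenizer = AutoTokenizer.from_pretrained(model_id)
prompt = tokenizer.apply_chat_template(
    [{"role": "user", "content": "What is the sum of the first 100 odd numbers?"}],
    tokenize=False,
    add_generation_prompt=True,
)

# Sampling parameters here are illustrative, not the README's exact values.
sampling = SamplingParams(temperature=0.6, top_p=0.95, max_tokens=20000)
outputs = llm.generate([prompt], sampling)
print(outputs[0].outputs[0].text)
```

The generous max_tokens budget matters for slow-thinking models, which emit long step-by-step reasoning chains before the final answer.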

Highlighted Details

  • SimpleDeepSearcher: Framework for autonomous web search using knowledge distillation, outperforming RL approaches.
  • OlymMATH: A challenging benchmark of 200 Olympiad-level math problems in English and Chinese, highlighting LLM limitations.
  • R1-Searcher: RL-based approach for LLM search capabilities without distillation or SFT.
  • STILL-3-Tool-32B: Achieves 81.70% accuracy on AIME 2024 using Python code and tool manipulation (see the tool-loop sketch after this list).
  • Virgo: A multimodal slow-thinking model demonstrating transferability of reasoning from text to vision.
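To make the tool-manipulation idea concrete, here is a minimal sketch of a generate-execute-feed-back loop: the model writes Python inside fenced code blocks, the harness executes it, and the output is appended so the model can keep reasoning. The fenced-block convention, the `generate` callable, and the round limit are illustrative assumptions, not STILL-3-Tool-32B's actual protocol.

```python
import re
import subprocess
import sys

# Matches fenced python blocks in the model's reply (assumed convention).
CODE_RE = re.compile(r"```python\n(.*?)```", re.DOTALL)

def run_python(code: str, timeout: int = 30) -> str:
    """Execute model-written code in a subprocess and capture its output."""
    result = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True, text=True, timeout=timeout,
    )
    return result.stdout + result.stderr

def solve_with_tools(question: str, generate, max_rounds: int = 5) -> str:
    """`generate` is a hypothetical callable: transcript in, model reply out."""
    transcript = question
    for _ in range(max_rounds):
        reply = generate(transcript)   # one model turn
        transcript += reply
        blocks = CODE_RE.findall(reply)
        if not blocks:                 # no code emitted: treat as final answer
            return transcript
        # Execute the last code block and append its output for the next turn.
        transcript += f"\n[execution output]\n{run_python(blocks[-1])}\n"
    return transcript
```

Running model-written code in a subprocess with a timeout is only a minimal safeguard; a real deployment would sandbox execution properly.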

Maintenance & Community

The project is actively updated with recent reports and model releases (as of April 2025). Links to Hugging Face and Notion pages are provided for specific projects.

Licensing & Compatibility

The README does not explicitly state a single overarching license for the repository's content. Individual models and datasets may have different licenses, with some explicitly open-sourced for research purposes. Compatibility for commercial use is not specified.

Limitations & Caveats

The project acknowledges that its exploration is preliminary, with a capacity gap compared to industry-level systems. Future work focuses on scaling training approaches and extending capabilities to more complex tasks. Some models are released as previews or for research purposes only.

Health Check

  • Last commit: 1 month ago
  • Responsiveness: 1 week
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 59 stars in the last 90 days
