awesome-deep-reasoning  by modelscope

Collection of resources for reasoning models

created 6 months ago
399 stars

Top 73.5% on sourcepulse

Project Summary

This repository is a curated collection of papers, models, datasets, and infrastructure related to "deep reasoning" in Large Language Models (LLMs), with a focus on work inspired by OpenAI's o1 and DeepSeek-R1. It is aimed at researchers and developers building LLMs that reason step by step, particularly in mathematics, coding, and multimodal understanding.

How It Works

The collection highlights research and models that use reinforcement learning (RL), chain-of-thought (CoT) prompting, and specialized fine-tuning to improve LLM reasoning. Key approaches include process reward models (PRMs), which score intermediate reasoning steps rather than only final answers, and reinforcement learning from human feedback (RLHF). The list also covers infrastructure for training and serving large reasoning models, such as DualPipe (pipeline parallelism) and FlashMLA (an attention decoding kernel).
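To make the process-reward idea concrete, here is a minimal sketch of how step-level scores might be used to choose among candidate reasoning chains. It is an illustration under stated assumptions, not code from the repository: `toy_step_scorer` is a hypothetical stand-in for a trained PRM, and aggregating by the minimum step score is one common choice among several.

```python
from typing import Callable, List, Sequence

# A step scorer maps one reasoning step to a quality score in [0, 1].
StepScorer = Callable[[str], float]


def chain_score(steps: Sequence[str], score_step: StepScorer) -> float:
    """Aggregate per-step scores by taking the minimum.

    Assumption: a chain is treated as only as reliable as its weakest step.
    Other aggregations (product, mean, last-step score) are also used.
    """
    if not steps:
        raise ValueError("a reasoning chain needs at least one step")
    return min(score_step(step) for step in steps)


def best_of_n(candidates: List[List[str]], score_step: StepScorer) -> List[str]:
    """Return the candidate chain with the highest aggregated score."""
    return max(candidates, key=lambda steps: chain_score(steps, score_step))


def toy_step_scorer(step: str) -> float:
    """Hypothetical placeholder for a trained PRM: distrust steps that guess."""
    return 0.1 if "guess" in step.lower() else 0.9


candidates = [
    ["2 + 3 = 5", "5 * 4 = 20"],
    ["2 + 3 = 5", "I guess the answer is 25"],
]
best = best_of_n(candidates, toy_step_scorer)
print(best)
```

In practice the scorer would be a model call, and the candidates would be sampled from the policy being evaluated.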

Quick Start & Requirements

This is a collection, not a single runnable project. Specific models and infrastructure components will have their own installation and usage instructions. Links to official repositories for models (e.g., DeepSeek-R1, Qwen) and infrastructure (e.g., Hugging Face's open-r1, DeepSeek's FlashMLA) are provided. Requirements vary significantly by component, often including Python, PyTorch, and potentially specific GPU hardware (e.g., Hopper GPUs for FlashMLA).
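Because some components target specific GPU generations, a quick local check can save a failed install. The sketch below assumes PyTorch may or may not be installed and relies on the fact that NVIDIA Hopper GPUs (for example, the H100) report CUDA compute capability 9.x. The helper names are hypothetical, and each linked project's own README remains the authority on what it supports.

```python
from typing import Optional, Tuple

HOPPER_MAJOR = 9  # Hopper GPUs report compute capability 9.x


def is_hopper(capability: Tuple[int, int]) -> bool:
    """True if a (major, minor) compute capability belongs to Hopper."""
    return capability[0] == HOPPER_MAJOR


def local_gpu_capability() -> Optional[Tuple[int, int]]:
    """Return the current GPU's compute capability, or None if unavailable."""
    try:
        import torch  # optional dependency; absent on many machines
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    return torch.cuda.get_device_capability()


capability = local_gpu_capability()
if capability is None:
    print("No CUDA GPU (or no PyTorch) detected.")
elif is_hopper(capability):
    print(f"Hopper GPU detected (compute capability {capability}).")
else:
    print(f"Non-Hopper GPU detected (compute capability {capability}).")
```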

Highlighted Details

  • Comprehensive listing of papers, models (DeepSeek-R1, Qwen series, S1), and datasets (OpenR1-Math, LLaVA-R1-100k) related to deep reasoning.
  • Details on training infrastructure and techniques, including RL algorithms (GRPO, DAPO), efficient kernels (DeepGEMM), and distributed file systems (3FS).
  • Focus on multimodal reasoning, agent-based reasoning, and applications in competitive programming and tool use.
  • Links to official reproductions and related projects for various reasoning models and frameworks.
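GRPO, named in the list above, replaces a learned value baseline with statistics computed over a group of responses sampled for the same prompt. The sketch below shows only that group-relative advantage step, under assumptions: it uses the population standard deviation and a small `eps` for stability, whereas real implementations may differ in both, and it omits the clipped policy-gradient objective and KL term entirely.

```python
from statistics import mean, pstdev
from typing import List


def group_relative_advantages(rewards: List[float], eps: float = 1e-6) -> List[float]:
    """Normalize each reward against its own sampling group.

    advantage_i = (r_i - mean(r)) / (std(r) + eps)

    `eps` is an assumed stabilizer for groups where every reward is equal.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Four sampled answers to one prompt, rewarded 1.0 if correct and 0.0 otherwise.
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)
print(advantages)
```

Correct answers receive positive advantages and incorrect ones negative, and a group with identical rewards yields all-zero advantages, so it contributes no gradient signal.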

Maintenance & Community

The repository tracks recent research and model releases through a "News" section, although the health check below reports no commits in the last three months. It links to GitHub repositories and to platforms such as Hugging Face and ModelScope, which suggests engagement with the broader open-source community.

Licensing & Compatibility

The repository itself is a collection of links and information; licensing depends on the individual projects and models referenced. Many linked models and datasets are open-source, but specific licenses (e.g., Apache 2.0, MIT) should be checked for each component. Compatibility for commercial use will vary.

Limitations & Caveats

As a curated list, this repository does not provide a unified API or single point of execution. Users must navigate to individual linked projects for specific functionalities, dependencies, and usage instructions. The rapid pace of development in this field means some linked resources may become outdated or superseded.

Health Check

  • Last commit: 3 months ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star history: 46 stars in the last 90 days
