r1-overthinker by qunash

Gradio app for extending DeepSeek R1 reasoning

Created 7 months ago
370 stars

Top 76.3% on SourcePulse

View on GitHub
Project Summary

This project extends the reasoning phase of DeepSeek R1 models, making them "overthink" and produce more thorough responses. It targets researchers and power users who want deeper insight into LLM reasoning, offering fine-grained control over the generation process and context lengths limited only by available VRAM.

How It Works

The core mechanism intercepts the model's attempts to end its reasoning early and replaces the end-of-thinking marker with a prompt that encourages further deliberation. This "budget forcing" technique, independently validated by the "s1: Simple test-time scaling" paper, extends the model's thinking in a controlled way until a user-defined token threshold is met. The app runs unsloth-optimized models for better throughput and lower VRAM use.
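
Below is a minimal sketch of that loop, assuming the model closes its reasoning with a </think> marker and that generate_chunk is a hypothetical helper that streams the next piece of text; the continuation cues and rough token counting are illustrative, not the project's exact implementation.

    import random

    THINK_END = "</think>"
    # Hypothetical continuation cues; the app lets users customize these.
    CUES = ["Wait, let me double-check that.", "Hmm, let me reconsider."]

    def force_longer_thinking(generate_chunk, prompt, min_thinking_tokens=1024):
        """Keep the model 'thinking' until a user-defined token threshold is met.

        generate_chunk(text) -> str: assumed helper returning newly generated text.
        """
        text = prompt + "<think>"
        approx_tokens = 0
        while True:
            chunk = generate_chunk(text)
            text += chunk
            approx_tokens += len(chunk.split())  # rough token count for the sketch
            if THINK_END not in chunk:
                continue
            if approx_tokens >= min_thinking_tokens:
                return text  # budget met: let the reasoning block close normally
            # Intercept the early conclusion: drop the end marker and append a
            # cue so the model resumes deliberating instead of answering.
            text = text.rsplit(THINK_END, 1)[0] + " " + random.choice(CUES)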

Quick Start & Requirements

  • Install via pip install -e .
  • Requires Python 3.10+ and PyTorch.
  • Supports DeepSeek R1 distill models from 1.5B to 70B parameters, covering both Qwen- and LLaMA-based variants (see the loading sketch after this list).
  • Models up to 14B parameters can run on a free Google Colab T4 GPU.
  • See unsloth for optimization details.
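
A minimal model-loading sketch under these assumptions: the repo id and sequence length below are illustrative, and any unsloth 4-bit DeepSeek R1 distill in the 1.5B to 70B range loads the same way.

    from unsloth import FastLanguageModel

    # The exact repo id is an assumption; pick any R1 distill that fits your GPU.
    model, tokenizer = FastLanguageModel.from_pretrained(
        model_name="unsloth/DeepSeek-R1-Distill-Qwen-14B-bnb-4bit",
        max_seq_length=8192,   # context is bounded only by available VRAM
        load_in_4bit=True,     # 4-bit weights let a 14B model fit a free Colab T4
    )
    FastLanguageModel.for_inference(model)  # enable unsloth's faster inference path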

Highlighted Details

  • Forces models to think longer and more thoroughly.
  • Customizable reasoning extensions and thinking thresholds.
  • Fine-grained control over model parameters (temperature, top-p).
  • Visible thinking process with token count tracking (a minimal UI sketch follows this list).
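
The sketch below mirrors those controls in a Gradio Blocks layout; the component names, default values, and stubbed generate function are assumptions rather than the app's actual UI code.

    import gradio as gr

    def generate_stub(prompt, min_thinking_tokens, temperature, top_p):
        # Placeholder: the real app would run the budget-forcing loop here
        # and stream the visible thinking process plus a token count.
        return f"(stub) would think for at least {int(min_thinking_tokens)} tokens"

    with gr.Blocks() as demo:
        prompt = gr.Textbox(label="Prompt")
        min_thinking = gr.Slider(0, 8192, value=1024, step=128,
                                 label="Minimum thinking tokens (threshold)")
        temperature = gr.Slider(0.0, 2.0, value=0.6, label="Temperature")
        top_p = gr.Slider(0.0, 1.0, value=0.95, label="Top-p")
        output = gr.Textbox(label="Response (thinking visible, token count shown)")
        gr.Button("Generate").click(
            fn=generate_stub,
            inputs=[prompt, min_thinking, temperature, top_p],
            outputs=output,
        )

    # demo.launch()  # serve the UI locally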

Maintenance & Community

  • Developed by anzorq.
  • Credits original idea to vgel's gist.
  • Utilizes unsloth for optimization and Gradio for the app interface.

Licensing & Compatibility

  • MIT License.
  • Permissive license suitable for commercial use and integration into closed-source projects.

Limitations & Caveats

The effectiveness of "overthinking" varies by model size and task; forcing longer reasoning does not guarantee better answers. The project depends on unsloth's optimized kernels, which bring their own installation requirements and are built around CUDA-capable NVIDIA GPUs.

Health Check

  • Last Commit: 7 months ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0

Star History

  • 2 stars in the last 30 days

Explore Similar Projects

Starred by Yineng Zhang (Inference Lead at SGLang; Research Scientist at Together AI).

dots.llm1 by rednote-hilab

0.2%
462
MoE model for research
Created 4 months ago
Updated 4 weeks ago
Starred by Georgios Konstantopoulos (CTO, General Partner at Paradigm), Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), and 5 more.

streaming-llm by mit-han-lab

0.1%
7k
Framework for efficient LLM streaming
Created 2 years ago
Updated 1 year ago
Starred by Jason Knight (Director AI Compilers at NVIDIA; Cofounder of OctoML), Omar Sanseviero (DevRel at Google DeepMind), and 11 more.

mistral.rs by EricLBuehler

0.3%
6k
LLM inference engine for blazing fast performance
Created 1 year ago
Updated 1 day ago
Starred by Lianmin Zheng (Coauthor of SGLang, vLLM), Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), and 1 more.

MiniCPM by OpenBMB

0.4%
8k
Ultra-efficient LLMs for end devices, achieving 5x+ speedup
Created 1 year ago
Updated 1 week ago
Starred by Tobi Lutke (Cofounder of Shopify), Andrej Karpathy (Founder of Eureka Labs; Formerly at Tesla, OpenAI; Author of CS 231n), and 36 more.

unsloth by unslothai

0.6%
46k
Finetuning tool for LLMs, targeting speed and memory efficiency
Created 1 year ago
Updated 14 hours ago