WeiboAI
Small model, big logic: diversity-driven optimization for advanced reasoning
Top 60.3% on SourcePulse
Summary
VibeThinker-1.5B is a 1.5 billion-parameter model demonstrating that small models can achieve robust reasoning capabilities. It targets engineers and researchers seeking highly efficient, cost-effective reasoning solutions. The project delivers state-of-the-art performance in mathematical and coding tasks with a fraction of the parameters and training cost of leading models.
How It Works
The model uses the "Spectrum-to-Signal Principle" (SSP) post-training methodology: "Two-Stage Diversity-Exploring Distillation" during supervised fine-tuning (SFT) generates diverse solution paths, and "MaxEnt-Guided Policy Optimization" (MGPO) during reinforcement learning (RL) then amplifies the correct reasoning signals. This approach elicits strong logical deduction from a compact model.
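The summary does not spell out MGPO's mechanics, but the MaxEnt idea can be illustrated with a small sketch: weight each training problem by the entropy of the model's empirical pass rate on it, so RL focuses on problems the model solves roughly half the time rather than ones it already solves or never solves. The function name and the rollout bookkeeping below are hypothetical, not taken from the project's code.

```python
import math

def maxent_weight(pass_rate: float, eps: float = 1e-6) -> float:
    """Hypothetical MaxEnt-style weight for a training problem.

    Uses the Bernoulli entropy of the model's pass rate, which peaks
    at 1.0 when pass_rate = 0.5 (maximal uncertainty) and falls toward
    0 for problems that are always or never solved.
    """
    p = min(max(pass_rate, eps), 1.0 - eps)  # clamp away from 0 and 1
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))

# Estimate pass rates from k sampled rollouts per problem, then scale
# each problem's policy-gradient contribution by its entropy weight.
rollout_correct = {"prob_a": [1, 0, 1, 0], "prob_b": [1, 1, 1, 1]}
weights = {pid: maxent_weight(sum(r) / len(r))
           for pid, r in rollout_correct.items()}
print(weights)  # prob_a ~1.0 (uncertain), prob_b ~0.0 (already mastered)
```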
Quick Start & Requirements
Requires transformers>=4.54.0; vLLM==0.10.1 or SGLang>=0.4.9.post6 is recommended for inference. Model checkpoints are available via Hugging Face and ModelScope, and evaluation scripts with sample responses are provided. Recommended sampling parameters: temperature 0.6 or 1.0, max token length 40960, top_p 0.95, and top_k -1 (for vLLM/SGLang). Direct links to these resources are not provided in the README. A minimal inference sketch using these settings follows.
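The sketch below wires the recommended sampling parameters into vLLM. The Hugging Face model id "WeiboAI/VibeThinker-1.5B" and the example prompt are assumptions; check the model card for the exact identifier.

```python
from vllm import LLM, SamplingParams

# Model id assumed from the org/project name; verify on Hugging Face.
llm = LLM(model="WeiboAI/VibeThinker-1.5B")

params = SamplingParams(
    temperature=0.6,   # README recommends 0.6 or 1.0
    top_p=0.95,
    top_k=-1,          # -1 disables top-k filtering in vLLM
    max_tokens=40960,  # long budget for multi-step reasoning traces
)

outputs = llm.generate(
    ["Prove that the sum of two odd integers is even."], params
)
print(outputs[0].outputs[0].text)
```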
Highlighted Details
Maintenance & Community
The README does not specify community channels or provide details on ongoing maintenance or active contributors beyond the paper authors.
Licensing & Compatibility
Licensed under the MIT License, which is generally permissive for commercial use and integration into closed-source projects.
Limitations & Caveats
The model is explicitly recommended for competitive math and coding problems; its performance on other task types is not documented. No information is provided on known bugs or unsupported platforms.