Skywork-OR1 by SkyworkAI

Math/code reasoner models trained with RL

Created 5 months ago
713 stars

Top 48.1% on SourcePulse

Project Summary

Skywork-OR1 is a series of math and code reasoning large language models, comprising a math-specialized model (Skywork-OR1-Math-7B) and general-purpose reasoning models (Skywork-OR1-7B-Preview and Skywork-OR1-32B-Preview). It targets researchers and developers working on LLM reasoning, and it reports strong results on benchmarks such as AIME and LiveCodeBench.

How It Works

The models are trained with large-scale rule-based reinforcement learning on curated datasets, following documented training recipes. The goal is to improve logical deduction and problem solving in both math and coding. Two things set the project apart: a multi-stage training pipeline and the use of Avg@K as an evaluation metric for more robust performance assessment.
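To make "rule-based" concrete, here is a minimal sketch of a rule-based reward for math problems. The source does not describe the project's actual reward rules, so everything here is an assumption for illustration: the convention that the final answer appears inside `\boxed{...}`, the whitespace-only normalization, the binary 0/1 reward, and the hypothetical names `extract_boxed` and `rule_based_reward`.

```python
def extract_boxed(text):
    # Return the content of the last \boxed{...} in text, or None if absent.
    # Braces are matched by depth so nested LaTeX such as \frac{1}{2} survives.
    marker = "\\boxed{"
    start = text.rfind(marker)
    if start == -1:
        return None
    depth = 1
    content = []
    for ch in text[start + len(marker):]:
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                return "".join(content).strip()
        content.append(ch)
    return None  # unbalanced braces: treat as no answer


def rule_based_reward(response, reference):
    # Assumed rule: reward 1.0 only if the boxed answer equals the reference
    # after removing spaces; anything else (including no boxed answer) gets 0.0.
    answer = extract_boxed(response)
    if answer is None:
        return 0.0
    return 1.0 if answer.replace(" ", "") == reference.replace(" ", "") else 0.0
```

A reward of this kind needs no learned reward model: correctness is decided by a deterministic check against a known answer. For coding tasks the analogous rule would typically be whether the generated program passes a set of test cases.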

Quick Start & Requirements

  • Installation: either pull the Docker image (docker pull whatcanyousee/verl:vemlp-th2.4.0-cu124-vllm0.6.3-ray2.10-te2.0-megatron0.11.0-v0.0.6) or set up a Conda environment by running, in order: conda create -n verl python==3.10; pip3 install torch==2.4.0 --index-url https://download.pytorch.org/whl/cu124; pip3 install flash-attn --no-build-isolation.
  • Prerequisites: NVIDIA GPU with CUDA 12.4, Python 3.10.
  • Setup: Requires cloning the repository and installing dependencies. Multi-node training is supported via Ray.
  • Links: Models, Data, Code, Notion Blog.
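After installation, a quick sanity check can confirm that the environment matches the stated prerequisites. The following is a sketch only, not part of the project: the function name `check_environment` and its default arguments are hypothetical, with the expected versions taken from the list above (Python 3.10, torch 2.4.0, CUDA 12.4).

```python
import sys


def check_environment(expected_python=(3, 10), expected_torch="2.4.0", expected_cuda="12.4"):
    # Returns a dict of booleans; it never raises if torch is missing or broken.
    report = {"python_ok": sys.version_info[:2] == expected_python}
    try:
        import torch
    except Exception:
        report["torch_ok"] = False
        report["cuda_ok"] = False
        return report
    # torch.__version__ may carry a build suffix such as "+cu124".
    report["torch_ok"] = torch.__version__.startswith(expected_torch)
    # torch.version.cuda is None on CPU-only builds.
    report["cuda_ok"] = bool(torch.cuda.is_available()) and torch.version.cuda == expected_cuda
    return report


if __name__ == "__main__":
    for name, ok in check_environment().items():
        print(f"{name}: {'ok' if ok else 'MISMATCH'}")
```

Run inside the Conda environment or the Docker container, this prints one line per check. A mismatch does not necessarily mean training will fail, only that the environment differs from the versions listed here.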

Highlighted Details

  • Skywork-OR1-Math-7B achieves 69.8 on AIME24 and 52.3 on AIME25 (Avg@32).
  • Skywork-OR1-32B-Preview matches DeepSeek-R1's performance on math and coding tasks.
  • Skywork-OR1-7B-Preview outperforms similarly sized models in math and coding.
  • Introduces Avg@K as a more robust evaluation metric than Pass@1.
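The source does not define Avg@K. A common reading, assumed here, is the mean accuracy over K independently sampled responses per problem, averaged across problems. The sketch below computes that quantity and contrasts it with a Pass@1 score taken from a single draw; the function names and the toy data are hypothetical.

```python
from statistics import mean


def avg_at_k(results, k):
    # results: one list per problem, each with K booleans (True = sample judged correct).
    for samples in results:
        if len(samples) != k:
            raise ValueError(f"expected {k} samples per problem, got {len(samples)}")
    per_problem = [sum(samples) / k for samples in results]
    return 100.0 * mean(per_problem)


def pass_at_1_single_draw(results, draw=0):
    # Pass@1 estimated from just one sampled response per problem.
    return 100.0 * mean(1.0 if samples[draw] else 0.0 for samples in results)


# Toy data: 2 problems, 4 sampled responses each.
toy = [
    [True, False, True, True],
    [False, False, False, True],
]
print(avg_at_k(toy, k=4))               # 50.0
print(pass_at_1_single_draw(toy, 1))    # 0.0
print(pass_at_1_single_draw(toy, 3))    # 100.0
```

On the toy data, a single-draw Pass@1 swings between 0 and 100 depending on which sample happens to be picked, while Avg@4 stays at 50. That variance matters on small benchmarks such as AIME, where one problem moves the score by several points.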

Maintenance & Community

The project is actively maintained by SkyworkAI. Community resources include a GitHub repository and a Notion blog detailing training recipes and experimental results.

Licensing & Compatibility

The models are built on top of DeepSeek-R1-Distill models, and the training code is a custom fork of the verl project. The README does not explicitly state a license for the Skywork-OR1 models; the underlying components may carry their own licenses.

Limitations & Caveats

Some models are labeled "Preview" in the README, which suggests they are not final releases. A technical report is pending. The training code depends on a custom fork of verl, which may diverge from the upstream verl project and add dependencies of its own.

Health Check

  • Last commit: 3 months ago
  • Responsiveness: 1 week
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 14 stars in the last 30 days
