DeepSeek-Prover-V2 by deepseek-ai

LLM for formal theorem proving in Lean 4, initialized with cold-start data generated via DeepSeek-V3

created 3 months ago
1,175 stars

Top 33.8% on sourcepulse

View on GitHub
Project Summary

DeepSeek-Prover-V2 is an open-source large language model designed for formal theorem proving in Lean 4. It aims to advance formal mathematical reasoning by integrating informal reasoning with formal proof construction, targeting researchers and developers in automated theorem proving and formal verification.

How It Works

The model utilizes a recursive theorem proving pipeline powered by DeepSeek-V3. It first prompts DeepSeek-V3 to decompose complex theorems into subgoals and formalize these steps in Lean 4. A smaller 7B model handles proof search for individual subgoals. Successful proofs are then combined with DeepSeek-V3's chain-of-thought reasoning to create a cold-start dataset. This dataset is used for fine-tuning, followed by a reinforcement learning stage that uses binary correct-or-incorrect feedback to enhance the model's ability to bridge informal reasoning with formal proof generation.
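
To make the decomposition step concrete, here is a toy sketch of the kind of Lean 4 skeleton such a pipeline might produce: the original goal is split into `have` subgoals left as `sorry` placeholders for a prover to fill. The theorem and the Mathlib lemma names are our own illustration, not output from the repository.

```lean
import Mathlib

-- Illustrative sketch only (assumed example, not pipeline output):
-- a strong model drafts the subgoal skeleton, leaving `sorry` placeholders,
-- and a smaller prover then searches for a proof of each subgoal.
theorem sum_sq_nonneg (a b : ℝ) : 0 ≤ a ^ 2 + b ^ 2 := by
  have h1 : 0 ≤ a ^ 2 := by sorry   -- subgoal 1, to be closed by the 7B prover
  have h2 : 0 ≤ b ^ 2 := by sorry   -- subgoal 2, to be closed by the 7B prover
  exact add_nonneg h1 h2            -- subgoal proofs combine to close the original goal
```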

Quick Start & Requirements

  • Install/Run: Use Hugging Face's Transformers library (see the sketch after this list).
  • Prerequisites: Python, PyTorch. The 671B model requires significant computational resources.
  • Resources: The 671B model is based on the DeepSeek-V3 architecture. The 7B model has an extended context length of 32K tokens.
  • Links: Model Download, Dataset Download, DeepSeek-V3 Docs
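
As a minimal sketch of the Transformers route, assuming the 7B checkpoint is published on Hugging Face under deepseek-ai/DeepSeek-Prover-V2-7B (the prompt format and generation settings below are illustrative, not taken from the README):

```python
# Minimal sketch: load the 7B prover with Hugging Face Transformers and
# ask it to complete a Lean 4 proof. Model ID and prompt are assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "deepseek-ai/DeepSeek-Prover-V2-7B"  # assumed Hugging Face model ID
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)

# The `sorry` marks the goal the model should fill in.
prompt = (
    "Complete the following Lean 4 proof:\n"
    "theorem add_comm_example (a b : Nat) : a + b = b + a := by\n"
    "  sorry\n"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```

The 671B variant follows the same interface but requires multi-GPU serving; the 7B model is the practical choice for local experimentation.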

Highlighted Details

  • Achieves an 88.9% pass rate on MiniF2F-test and solves 49 of 658 problems on PutnamBench.
  • Introduces ProverBench, a benchmark of 325 formalized problems, including AIME competition problems.
  • Available in 7B and 671B parameter sizes.
  • The 7B model is built on DeepSeek-Prover-V1.5-Base with a 32K token context length.

Maintenance & Community

Licensing & Compatibility

  • Subject to the Model License. Specific terms are not detailed in the README.

Limitations & Caveats

The README does not specify any limitations or caveats regarding the model's performance, stability, or potential biases. The 671B model's substantial size implies significant hardware requirements for inference.

Health Check

  • Last commit: 2 weeks ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0

Star History

  • 286 stars in the last 90 days

Explore Similar Projects

Starred by Georgios Konstantopoulos (CTO, General Partner at Paradigm) and Jiayi Pan (Author of SWE-Gym; AI Researcher at UC Berkeley).

DeepSeek-V2 by deepseek-ai

0.1% · 5k stars · MoE language model for research/API use
created 1 year ago · updated 10 months ago
Starred by Chip Huyen (Author of AI Engineering, Designing Machine Learning Systems) and Jiayi Pan (Author of SWE-Gym; AI Researcher at UC Berkeley).

DeepSeek-Coder-V2 by deepseek-ai

0.4% · 6k stars · Open-source code language model comparable to GPT4-Turbo
created 1 year ago · updated 10 months ago
Starred by Michael Han (Cofounder of Unsloth), Sebastian Raschka (Author of Build a Large Language Model From Scratch), and 6 more.

DeepSeek-R1 by deepseek-ai

0.1% · 91k stars · Reasoning models research paper
created 6 months ago · updated 1 month ago