DeepSeek-Prover-V2  by deepseek-ai

LLM for formal theorem proving in Lean 4, trained on a cold-start dataset generated with DeepSeek-V3

Created 4 months ago
1,188 stars

Top 32.8% on SourcePulse

View on GitHub
1 Expert Loves This Project
Project Summary

DeepSeek-Prover-V2 is an open-source large language model designed for formal theorem proving in Lean 4. It aims to advance formal mathematical reasoning by integrating informal reasoning with formal proof construction, targeting researchers and developers in automated theorem proving and formal verification.

How It Works

The model utilizes a recursive theorem proving pipeline powered by DeepSeek-V3. It first prompts DeepSeek-V3 to decompose complex theorems into subgoals and formalize these steps in Lean 4. A smaller 7B model handles proof search for individual subgoals. Successful proofs are then combined with DeepSeek-V3's chain-of-thought reasoning to create a cold-start dataset. This dataset is used for fine-tuning, followed by a reinforcement learning stage that uses binary correct-or-incorrect feedback to enhance the model's ability to bridge informal reasoning with formal proof generation.
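
A minimal Lean 4 sketch of the kind of subgoal decomposition described above (a hand-written illustration, assuming Mathlib provides `mul_self_nonneg` and `add_nonneg`; in the actual pipeline the decomposition is produced by DeepSeek-V3, not by hand):

```lean
-- The top-level goal is split into two named subgoals (h1, h2),
-- each simple enough for a smaller prover model to close independently.
theorem sum_sq_nonneg (a b : ℤ) : 0 ≤ a * a + b * b := by
  have h1 : 0 ≤ a * a := mul_self_nonneg a
  have h2 : 0 ≤ b * b := mul_self_nonneg b
  exact add_nonneg h1 h2
```

Each `have` step corresponds to a subgoal handed to the 7B proof-search model; the combined proof, paired with the chain-of-thought that produced the decomposition, becomes a cold-start training example.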

Quick Start & Requirements

  • Install/Run: Use Hugging Face's Transformers library for inference.
  • Prerequisites: Python, PyTorch. The 671B model requires significant computational resources.
  • Resources: The 671B model is based on DeepSeek-V3 architecture. The 7B model has an extended context length of 32K tokens.
  • Links: Model Download, Dataset Download, DeepSeek-V3 Docs
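
As a starting point, a hedged sketch of how a Lean 4 theorem (with its proof left as `sorry`) might be formatted into a completion prompt before being passed to the model via Transformers. The wrapper function and exact prompt wording are assumptions for illustration; consult the model card on Hugging Face for the official usage:

```python
# Hypothetical helper: wrap an unfinished Lean 4 theorem in a
# proof-completion prompt. The phrasing here is an assumption;
# the model card documents the exact expected format.
def build_prover_prompt(theorem: str) -> str:
    """Format a Lean 4 theorem ending in `sorry` for proof completion."""
    return (
        "Complete the following Lean 4 code:\n\n"
        "```lean4\n"
        f"{theorem}\n"
        "```\n"
    )

prompt = build_prover_prompt(
    "theorem add_comm' (a b : Nat) : a + b = b + a := by\n  sorry"
)
print(prompt)
# The resulting string would then be tokenized and passed to the model,
# e.g. via transformers.AutoModelForCausalLM.generate (not run here,
# since the 671B checkpoint requires substantial hardware).
```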

Highlighted Details

  • Achieves an 88.9% pass ratio on MiniF2F-test and solves 49 of 658 problems on PutnamBench.
  • Introduces ProverBench, a benchmark with 325 formalized problems including AIME competition math.
  • Available in 7B and 671B parameter sizes.
  • The 7B model is built on DeepSeek-Prover-V1.5-Base with a 32K token context length.

Licensing & Compatibility

  • Subject to the Model License. Specific terms are not detailed in the README.

Limitations & Caveats

The README does not specify any limitations or caveats regarding the model's performance, stability, or potential biases. The 671B model's substantial size implies significant hardware requirements for inference.

Health Check

  • Last Commit: 2 months ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 11 stars in the last 30 days

Explore Similar Projects

Starred by Dan Abramov (Core Contributor to React; Coauthor of Redux, Create React App), Thomas Wolf (Cofounder of Hugging Face), and 1 more.

LeanCopilot by lean-dojo

0.2%
1k
Copilot for theorem proving in Lean
Created 2 years ago
Updated 3 days ago
Starred by Dan Abramov (Core Contributor to React; Coauthor of Redux, Create React App) and Edward Sun (Research Scientist at Meta Superintelligence Lab).

LeanDojo by lean-dojo

0%
705
Machine learning for theorem proving in Lean
Created 2 years ago
Updated 5 days ago