DeepSeek-Prover-V1.5  by deepseek-ai

Theorem prover for formal mathematics using Lean 4

created 11 months ago
531 stars

Top 60.4% on sourcepulse

GitHubView on GitHub
Project Summary

DeepSeek-Prover-V1.5 is an open-source language model for theorem proving in Lean 4, designed to improve upon its predecessor by optimizing training and inference. It targets researchers and practitioners in formal mathematics and AI, offering state-of-the-art performance on benchmarks like miniF2F and ProofNet.

How It Works

The model is pre-trained on formal mathematical languages and fine-tuned with supervised learning and reinforcement learning from proof assistant feedback (RLPAF). A key innovation is RMaxTS, a Monte-Carlo tree search variant employing intrinsic-reward-driven exploration for diverse proof path generation, moving beyond single-pass proof generation.

Quick Start & Requirements

  • Install: Clone repo, install dependencies (pip install -r requirements.txt), set up Lean 4, build Mathlib4.
  • Prerequisites: Linux, Python 3.10, Lean 4.
  • Inference: Use Huggingface Transformers.
  • Experiments: python -m prover.launch --config=configs/RMaxTS.py --log_dir=logs/RMaxTS_results
  • Links: Model Downloads, quick_start.py

Highlighted Details

  • Achieves 63.5% on miniF2F-test and 25.3% on ProofNet benchmarks.
  • Offers 7B parameter models: Base, SFT, and RL versions.
  • Integrates RL from proof assistant feedback and RMaxTS for enhanced proof search.
  • Supports commercial use via its model license.

Maintenance & Community

  • Primarily maintained by DeepSeek AI.
  • For questions and discussions, use GitHub Discussions.
  • Report bugs via GitHub Issues.

Licensing & Compatibility

  • Code repository: MIT License.
  • Model use: Subject to DeepSeekMath Model License.
  • Supports commercial use.

Limitations & Caveats

The model is specifically designed for Lean 4 theorem proving and may not generalize to other formal languages or domains without further adaptation. Performance claims are based on specific benchmark evaluations.

Health Check
Last commit

11 months ago

Responsiveness

Inactive

Pull Requests (30d)
0
Issues (30d)
0
Star History
26 stars in the last 90 days

Explore Similar Projects

Starred by Chip Huyen Chip Huyen(Author of AI Engineering, Designing Machine Learning Systems), Jeff Hammerbacher Jeff Hammerbacher(Cofounder of Cloudera), and
10 more.

open-r1 by huggingface

0.2%
25k
SDK for reproducing DeepSeek-R1
created 6 months ago
updated 3 days ago
Feedback? Help us improve.