DeepSeek-Prover-V1.5  by deepseek-ai

Theorem prover for formal mathematics using Lean 4

Created 1 year ago
537 stars

Top 59.2% on SourcePulse

GitHubView on GitHub
Project Summary

DeepSeek-Prover-V1.5 is an open-source language model for theorem proving in Lean 4, designed to improve upon its predecessor by optimizing training and inference. It targets researchers and practitioners in formal mathematics and AI, offering state-of-the-art performance on benchmarks like miniF2F and ProofNet.

How It Works

The model is pre-trained on formal mathematical languages and fine-tuned with supervised learning and reinforcement learning from proof assistant feedback (RLPAF). A key innovation is RMaxTS, a Monte-Carlo tree search variant employing intrinsic-reward-driven exploration for diverse proof path generation, moving beyond single-pass proof generation.

Quick Start & Requirements

  • Install: Clone repo, install dependencies (pip install -r requirements.txt), set up Lean 4, build Mathlib4.
  • Prerequisites: Linux, Python 3.10, Lean 4.
  • Inference: Use Huggingface Transformers.
  • Experiments: python -m prover.launch --config=configs/RMaxTS.py --log_dir=logs/RMaxTS_results
  • Links: Model Downloads, quick_start.py

Highlighted Details

  • Achieves 63.5% on miniF2F-test and 25.3% on ProofNet benchmarks.
  • Offers 7B parameter models: Base, SFT, and RL versions.
  • Integrates RL from proof assistant feedback and RMaxTS for enhanced proof search.
  • Supports commercial use via its model license.

Maintenance & Community

  • Primarily maintained by DeepSeek AI.
  • For questions and discussions, use GitHub Discussions.
  • Report bugs via GitHub Issues.

Licensing & Compatibility

  • Code repository: MIT License.
  • Model use: Subject to DeepSeekMath Model License.
  • Supports commercial use.

Limitations & Caveats

The model is specifically designed for Lean 4 theorem proving and may not generalize to other formal languages or domains without further adaptation. Performance claims are based on specific benchmark evaluations.

Health Check
Last Commit

1 year ago

Responsiveness

Inactive

Pull Requests (30d)
0
Issues (30d)
0
Star History
4 stars in the last 30 days

Explore Similar Projects

Starred by Dan Abramov Dan Abramov(Core Contributor to React; Coauthor of Redux, Create React App) and Edward Sun Edward Sun(Research Scientist at Meta Superintelligence Lab).

LeanDojo by lean-dojo

0%
705
Machine learning for theorem proving in Lean
Created 2 years ago
Updated 5 days ago
Feedback? Help us improve.