Kimina-Prover-Preview by MoonshotAI

Research paper and resources for a formal reasoning model in Lean 4

created 3 months ago
320 stars

Top 86.0% on sourcepulse

View on GitHub
Project Summary

Kimina-Prover Preview introduces a novel approach to formal mathematical reasoning, targeting researchers and developers in automated theorem proving and formal verification. It achieves state-of-the-art performance on benchmarks like miniF2F by leveraging large language models trained with reinforcement learning, enabling human-like reasoning and rigorous theorem proving in Lean 4.

How It Works

The model employs a whole-proof generation strategy trained with reinforcement learning, eschewing search-heavy techniques such as Monte Carlo tree search. Combined with significant model scale (72B parameters) and an extended context window (32K tokens), this allows efficient learning and robust performance. A key innovation is the "Formal Reasoning Pattern," a structured reasoning style that bridges informal mathematical intuition with formal verification in Lean 4.
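As a purely illustrative sketch (not taken from the paper or the repository), the whole-proof setting asks the model to emit a complete, self-contained Lean 4 proof in a single pass, which the Lean checker then accepts or rejects as a unit:

```lean
import Mathlib

-- Illustrative only: a simple theorem of the kind a whole-proof
-- generation model would complete in one pass, rather than building
-- the proof tactic-by-tactic with external search.
theorem sum_of_squares_nonneg (a b : ℝ) : 0 ≤ a ^ 2 + b ^ 2 := by
  have ha : 0 ≤ a ^ 2 := sq_nonneg a
  have hb : 0 ≤ b ^ 2 := sq_nonneg b
  linarith
```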

Quick Start & Requirements

  • Install: The distilled models and an autoformalization model are available on Hugging Face; the Kimina Lean Server is released separately (see the sketch after this list for loading a distilled model).
  • Prerequisites: Lean 4, specific Python versions, and potentially GPU resources for running the models.
  • Resources: Hardware requirements for running the 72B model are not explicitly stated but are expected to be substantial.
  • Links: HuggingFace, Demo, Kimina Lean Server, CombiBench
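
A minimal getting-started sketch, assuming standard Hugging Face transformers usage; the model id and prompt format below are illustrative assumptions, not confirmed by this summary, so check the project's Hugging Face page for the actual checkpoint names and recommended prompting:

```python
# Sketch only: sample a whole Lean 4 proof from a distilled checkpoint.
# The model id below is a hypothetical placeholder.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "AI-MO/Kimina-Prover-Preview-Distill-7B"  # assumed id, verify on Hugging Face
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Prompt with a Lean 4 theorem statement and let the model generate
# the complete proof in one pass (assumed prompt format).
prompt = "theorem my_add_comm (a b : ℕ) : a + b = b + a := by\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512, do_sample=True, temperature=0.6)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

The generated proof text would then be checked with the Kimina Lean Server; whether the distilled models expect a bare theorem statement or a chat-style prompt is not stated here.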

Highlighted Details

  • Achieves 80%+ pass rate on miniF2F, outperforming prior works by a significant margin.
  • Demonstrates strong performance with high sample efficiency (e.g., 68.85% pass@32).
  • Identifies and releases corrections for at least 5 problems in the miniF2F-test dataset.
  • Utilizes a 32K token context window, reported as the longest used in neural theorem proving to date.

Maintenance & Community

The project is associated with the Numina & Kimi Team. Further community engagement channels are not explicitly listed in the README.

Licensing & Compatibility

The README does not specify a license. Compatibility for commercial use or closed-source linking is not detailed.

Limitations & Caveats

The project is released as a "Preview," suggesting potential for ongoing development and changes. Specific hardware requirements for optimal performance, particularly for the larger models, are not fully detailed.

Health Check

  • Last commit: 3 weeks ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 1
  • Issues (30d): 5
  • Star History: 50 stars in the last 90 days
