math-lm by EleutherAI

Open language model for mathematics research paper

created 2 years ago
1,083 stars

Top 35.7% on sourcepulse

Project Summary

Llemma is an open-source language model specifically designed for mathematical tasks, targeting researchers and developers in AI and mathematics. It offers specialized capabilities for understanding and generating mathematical content, potentially accelerating research and development in areas requiring advanced mathematical reasoning.

How It Works

Llemma is produced by continued pretraining of Code Llama, a transformer-based language model, using EleutherAI's GPT-NeoX training library. The models are trained on Proof-Pile-2, a curated mathematical corpus that includes the AlgebraicStack code dataset, to strengthen their mathematical reasoning abilities. This approach leverages large-scale domain data and a proven training stack to achieve specialized performance in mathematical domains.

Quick Start & Requirements

To use the code, clone the repository with git clone --recurse-submodules, or run git submodule update --init --recursive after a plain clone. The Llemma 7B and 34B models are available via Hugging Face Hub links. Scripts for data preprocessing, fine-tuning, and evaluation are provided within the repository.
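The clone step above can be sketched as follows (the repository URL is assumed from the project's GitHub listing and not stated on this page):

```shell
# Clone the repository together with its submodules in one step
git clone --recurse-submodules https://github.com/EleutherAI/math-lm.git
cd math-lm

# Or, if the repository was already cloned without submodules,
# fetch them after the fact:
git submodule update --init --recursive
```

The --recurse-submodules flag matters here because the training and evaluation code lives in submodules; a plain clone leaves those directories empty.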

Highlighted Details

  • Offers pre-trained models: Llemma 7B and Llemma 34B.
  • Includes datasets: Proof-Pile-2 and its AlgebraicStack subset.
  • Provides code for fine-tuning and evaluation experiments.
  • Integrates with EleutherAI's LM Evaluation Harness for benchmarking.
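As a hedged illustration of the benchmarking integration, running a Llemma checkpoint through the LM Evaluation Harness might look like the following (the model ID and task name are assumptions for illustration, not taken from this page):

```shell
# Hypothetical benchmark run: evaluate the Llemma 7B checkpoint on a
# math reasoning task using EleutherAI's lm-evaluation-harness CLI.
pip install lm-eval

lm_eval --model hf \
    --model_args pretrained=EleutherAI/llemma_7b \
    --tasks gsm8k \
    --batch_size 8
```

This assumes a GPU machine with the Hugging Face weights accessible; the repository's own evaluation scripts may pin a specific harness version, so check them before comparing numbers.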

Maintenance & Community

This project is from EleutherAI, a prominent research collective focused on open-source AI. Further community engagement and project updates can typically be found through EleutherAI's official channels.

Licensing & Compatibility

The license for the Llemma models and associated code is not explicitly stated in the provided README. EleutherAI projects often use permissive licenses such as Apache 2.0 or MIT, but this should be verified before commercial use or closed-source integration.

Limitations & Caveats

The README does not detail specific performance benchmarks or limitations of the Llemma models. The project is presented as a repository for data and training code, with the models themselves hosted separately on Hugging Face Hub.

Health Check

Last commit: 1 year ago
Responsiveness: 1 day
Pull Requests (30d): 0
Issues (30d): 0
Star History: 5 stars in the last 90 days

Explore Similar Projects

Starred by Chip Huyen (author of AI Engineering and Designing Machine Learning Systems), Ying Sheng (author of SGLang), and 9 more.

alpaca-lora by tloen

0.0%
19k
LoRA fine-tuning for LLaMA
created 2 years ago
updated 1 year ago