Abel by GAIR-NLP

SOTA LLM for math problem solving

created 1 year ago
333 stars

Top 83.6% on sourcepulse

View on GitHub
Project Summary

Abel is an open-source Large Language Model (LLM) focused on achieving state-of-the-art performance in mathematical reasoning without relying on external tools, reward models, or RLHF. It targets researchers and developers working on AI for STEM education and complex problem-solving, offering significant improvements over existing models on benchmarks like GSM8K and MATH.

How It Works

Abel is trained with a novel Supervised Fine-Tuning (SFT) methodology called "Parental Oversight," a data-processing philosophy that treats fine-tuning data the way a careful educator treats teaching material. It prioritizes data quality, relevance, and explicit step-by-step reasoning in training samples, aiming to instill genuine understanding rather than rote memorization. The authors present this SFT-centric approach as a significantly underestimated route to high performance on complex reasoning tasks.
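To make the data philosophy concrete, here is a minimal, hypothetical sketch of what a step-by-step SFT sample might look like: the target is not a bare answer but a full reasoning chain. The sample shown is the canonical GSM8K example; the exact field names and prompt template are assumptions, not Abel's actual data format.

```python
# Hypothetical "Parental Oversight"-style SFT sample: the answer field
# carries the full step-by-step derivation, not just the final number.
sft_sample = {
    "question": (
        "Natalia sold clips to 48 of her friends in April, and then she sold "
        "half as many clips in May. How many clips did Natalia sell altogether?"
    ),
    "answer": (
        "In April, Natalia sold 48 clips.\n"
        "In May, she sold half as many: 48 / 2 = 24 clips.\n"
        "Altogether she sold 48 + 24 = 72 clips.\n"
        "The answer is 72."
    ),
}

def to_prompt(sample):
    """Format a sample into a plain instruction/response pair for SFT."""
    return f"Question: {sample['question']}\nAnswer: {sample['answer']}"

print(to_prompt(sft_sample))
```

During fine-tuning, pairs like this would be concatenated and the loss computed on the answer tokens, so the model learns to emit the intermediate steps before the final result.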

Quick Start & Requirements

  • Install: Create a conda environment (conda create -n abel python=3.10), activate it (conda activate abel), and install dependencies (pip install -r requirements.txt).
  • Evaluation: Run bash evaluation/eval.sh.
  • Prerequisites: Python 3.10, conda.
  • Resources: Evaluation uses vLLM for inference, so results may vary slightly across runs and hardware.
  • Links: Model and Leaderboard, Evaluation
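The steps above can be run end to end roughly as follows; the repository URL is assumed from the project name and GitHub organization, so check it against the actual repo link.

```shell
# Clone the repository (URL assumed from the org/project name above)
git clone https://github.com/GAIR-NLP/abel.git
cd abel

# Create and activate the conda environment
conda create -n abel python=3.10 -y
conda activate abel

# Install dependencies
pip install -r requirements.txt

# Run the benchmark evaluation
bash evaluation/eval.sh
```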

Highlighted Details

  • Abel-7B-002 achieves 80.44 on GSM8K and 29.46 on MATH, outperforming other 7B models.
  • The 70B model reaches 83.62 on GSM8K and 28.26 on MATH, surpassing many proprietary models without tools.
  • Demonstrates strong robustness against out-of-distribution samples on GSM8k_robust dataset.
  • Achieves SOTA on the TAL-SCQ5K-EN dataset, outperforming MathGPT and GPT-4.
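Benchmark numbers like the GSM8K scores above are typically computed by extracting the model's final numeric answer and comparing it to the reference. The following is a generic sketch of that procedure, not Abel's actual evaluation code; the extraction heuristic (taking the last number in the output) is an assumption.

```python
import re

def extract_final_answer(text):
    # Heuristic (assumed): take the last number appearing in the output
    # as the model's final answer, ignoring thousands separators.
    nums = re.findall(r"-?\d+(?:\.\d+)?", text.replace(",", ""))
    return nums[-1] if nums else None

def accuracy(predictions, references):
    """Percentage of predictions whose extracted answer matches the reference."""
    correct = sum(
        extract_final_answer(p) == r for p, r in zip(predictions, references)
    )
    return 100.0 * correct / len(references)

preds = ["... 48 + 24 = 72. The answer is 72.", "The answer is 10."]
refs = ["72", "9"]
print(accuracy(preds, refs))  # 50.0
```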

Maintenance & Community

  • Developed by the GAIR Lab at Shanghai Jiao Tong University and the Shanghai AI Lab.
  • Actively refining models with planned updates.
  • Issues list maintained for limitations and potential solutions.

Licensing & Compatibility

  • Abel-7B-002 is licensed under Apache License 2.0.
  • Abel-7B-001 and Abel-13B-001 are licensed under Llama 2.
  • Apache 2.0 permits commercial use and closed-source integration; the Llama 2 Community License carries additional use restrictions.

Limitations & Caveats

The model's generalization capabilities are limited to specific mathematical domains, lacking broad applicability to diverse problem types or integration into multi-domain chatbots. Multilingual support is absent, and advanced techniques like reward models and RLHF have not been explored.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 2 stars in the last 90 days

Explore Similar Projects

Starred by Chip Huyen (author of AI Engineering and Designing Machine Learning Systems), Woosuk Kwon (author of vLLM), and 11 more.

WizardLM by nlpxucan

  • Top 0.1%, 9k stars
  • LLMs built using Evol-Instruct for complex instruction following
  • Created 2 years ago, updated 1 month ago