rho by microsoft

LLM pre-training research using Selective Language Modeling (SLM)

created 1 year ago
428 stars

Top 70.3% on sourcepulse

Project Summary

Rho-1 introduces Selective Language Modeling (SLM) for efficient LLM pre-training, targeting researchers and developers aiming to improve model performance on complex tasks like mathematics with significantly fewer training tokens. The project offers pre-trained models and evaluation scripts, demonstrating substantial gains in few-shot accuracy on benchmarks like GSM8k and MATH.

How It Works

SLM optimizes LLM pre-training by focusing on high-quality, "useful" tokens. The process involves training a reference model, scoring tokens in a corpus based on their loss relative to the reference, and then selectively training the target LLM only on tokens exhibiting higher "excess loss." This approach aims to filter out noisy or less informative tokens, leading to faster convergence and improved performance with reduced computational cost.
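The sketch below illustrates the idea of a selective training step under the assumptions stated here; it is not the repository's actual training code (which the README does not include). It assumes two Hugging Face causal language models, a trained reference model and the target model being pre-trained, and a hypothetical keep_ratio hyperparameter controlling the fraction of tokens retained.

```python
# Hypothetical sketch of a Selective Language Modeling (SLM) step, not the
# repo's training code. Assumes Hugging Face-style causal LMs whose forward
# pass returns `.logits`.
import torch
import torch.nn.functional as F


def per_token_loss(logits, labels):
    """Cross-entropy of each next-token prediction; shape (batch, seq_len - 1)."""
    shift_logits = logits[:, :-1, :]   # prediction for position t+1
    shift_labels = labels[:, 1:]
    loss = F.cross_entropy(
        shift_logits.reshape(-1, shift_logits.size(-1)),
        shift_labels.reshape(-1),
        reduction="none",
    )
    return loss.view(shift_labels.shape)


def slm_step(target_model, ref_model, input_ids, keep_ratio=0.6):
    """Back-propagate only through the tokens with the highest excess loss."""
    # Reference model scores tokens without gradients.
    with torch.no_grad():
        ref_loss = per_token_loss(ref_model(input_ids).logits, input_ids)

    tgt_loss = per_token_loss(target_model(input_ids).logits, input_ids)

    # Excess loss = target loss minus reference loss; tokens where the target
    # model lags the reference most are kept, the rest are masked out.
    excess = tgt_loss - ref_loss
    k = max(1, int(keep_ratio * excess.numel()))
    threshold = excess.flatten().topk(k).values.min()
    mask = (excess >= threshold).float()

    # Average the training loss over the selected tokens only.
    loss = (tgt_loss * mask).sum() / mask.sum()
    loss.backward()
    return loss.item()
```

The keep_ratio value and the top-k selection rule are illustrative; the paper's actual selection strategy and ratio may differ.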

Quick Start & Requirements

  • Evaluation: Navigate to rho-1/math-evaluation-harness and run bash scripts/run_eval.sh cot microsoft/rho-math-7b-v0.1 for base model evaluation or bash scripts/run_eval.sh tora microsoft/rho-math-7b-interpreter-v0.1 for code interpreter model evaluation.
  • Dependencies: Requires Python and standard ML libraries. Specific versions are not detailed, but the evaluation commands take Hugging Face model IDs, implying compatibility with the Transformers ecosystem.
  • Resources: Pre-trained models are available on Hugging Face. Evaluation requires computational resources for running inference.
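For using the released checkpoints outside the evaluation harness, a minimal inference sketch with Hugging Face Transformers might look like the following. The model ID is taken from the evaluation command above; the prompt and generation settings are illustrative assumptions, not prescribed by the repo.

```python
# Minimal inference sketch; prompt and generation parameters are illustrative.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "microsoft/rho-math-7b-v0.1"  # model ID from the evaluation command above
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

prompt = "Question: What is 15% of 240?\nAnswer:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```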

Highlighted Details

  • Rho-Math-1B and Rho-Math-7B models achieve competitive few-shot accuracy on MATH with up to 97% fewer pretraining tokens compared to baselines.
  • Rho-Math-1B-Interpreter is the first 1B LLM to exceed 40% accuracy on MATH.
  • Rho-Math-7B-Interpreter achieves 52% on MATH with only 69k fine-tuning samples.
  • The project provides evaluation scripts for both few-shot CoT and tool-integrated reasoning (Code Interpreter) scenarios.

Maintenance & Community

The project is associated with Microsoft Research. Contributions are welcomed, requiring agreement to a Contributor License Agreement (CLA).

Licensing & Compatibility

The code is released under the MIT License, permitting commercial use and integration with closed-source projects.

Limitations & Caveats

The README focuses on evaluation and the released pre-trained models; it does not fully explain how to run SLM pre-training from scratch or the specifics of the token-scoring and filtering pipeline.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 22 stars in the last 90 days
