Rho-1: LLM pre-training research using Selective Language Modeling (SLM)
Rho-1 introduces Selective Language Modeling (SLM) for efficient LLM pre-training, targeting researchers and developers aiming to improve model performance on complex tasks like mathematics with significantly fewer training tokens. The project offers pre-trained models and evaluation scripts, demonstrating substantial gains in few-shot accuracy on benchmarks like GSM8k and MATH.
How It Works
SLM optimizes LLM pre-training by focusing compute on high-quality, "useful" tokens. The process involves training a reference model on a curated corpus, scoring each token in the pre-training corpus by its "excess loss" (the training model's loss on that token minus the reference model's loss), and then computing the training loss only on the tokens with the highest excess loss. This filters out noisy or less informative tokens, yielding faster convergence and better downstream performance at reduced computational cost.
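To make the token-selection step concrete, here is a minimal sketch of such a selective loss under the excess-loss formulation described above; the function name slm_loss, the select_ratio value, and the tensor shapes are illustrative assumptions rather than code from the Rho-1 repository.

```python
# Minimal sketch of a selective language modeling (SLM) loss.
# Assumes ref_logits come from a frozen reference model (computed under torch.no_grad()).
import torch
import torch.nn.functional as F

def slm_loss(train_logits, ref_logits, labels, select_ratio=0.6):
    """Cross-entropy over only the tokens whose loss most exceeds the
    reference model's loss (the "excess loss"); other tokens are masked out."""
    vocab = train_logits.size(-1)
    # Per-token losses for the training model and the reference model.
    train_ce = F.cross_entropy(train_logits.view(-1, vocab), labels.view(-1),
                               reduction="none")
    ref_ce = F.cross_entropy(ref_logits.view(-1, vocab), labels.view(-1),
                             reduction="none")
    excess = train_ce - ref_ce                      # score: how "useful" each token is
    k = max(1, int(select_ratio * excess.numel()))  # keep the top fraction of tokens
    mask = torch.zeros_like(excess)
    mask[excess.topk(k).indices] = 1.0
    # Back-propagate only through the selected tokens' losses.
    return (train_ce * mask).sum() / mask.sum()
```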
Quick Start & Requirements
From the rho-1/math-evaluation-harness directory, run bash scripts/run_eval.sh cot microsoft/rho-math-7b-v0.1 for base model evaluation, or bash scripts/run_eval.sh tora microsoft/rho-math-7b-interpreter-v0.1 for code interpreter model evaluation.
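Outside the evaluation harness, the released checkpoints can also be loaded with the Hugging Face transformers library; the following is a hedged sketch assuming a standard causal-LM setup, with an illustrative prompt and generation settings rather than a snippet from the repository's README.

```python
# Sketch: loading a released Rho-1 checkpoint with Hugging Face transformers.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "microsoft/rho-math-7b-v0.1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto",
                                             device_map="auto")

prompt = "Question: What is 15% of 240?\nAnswer:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```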
Maintenance & Community
The project is associated with Microsoft Research. Contributions are welcomed but require agreement to a Contributor License Agreement (CLA). At the time of writing, the repository's last activity was about a year ago and it is marked inactive.
Licensing & Compatibility
The code is released under the MIT License, permitting commercial use and integration with closed-source projects.
Limitations & Caveats
The README focuses on evaluation and the pre-trained models; details on pre-training with SLM from scratch and on the exact token-filtering mechanism are not fully elaborated in the provided text.