Language model research repo for medium-sized models (up to 7B params)
OpenLM is a PyTorch-based language modeling repository designed for efficient research on medium-sized models (up to 7B parameters). It offers a minimal dependency set, focusing on PyTorch, XFormers, and Triton, making it accessible for researchers and practitioners looking to train or fine-tune LMs without the overhead of larger frameworks.
How It Works
OpenLM utilizes a modular design centered around PyTorch, allowing for flexible integration of performance-enhancing libraries like XFormers and Triton. The training pipeline supports distributed computation via `torchrun` and handles data preprocessing and loading through the `webdataset` package. This approach prioritizes core LM functionality and performance, enabling researchers to experiment with various model sizes and training configurations efficiently.
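As an illustration of the webdataset-style loading described above, the sketch below streams samples from tokenized .tar shards. The shard paths and the "txt" field name are assumptions chosen for illustration, not OpenLM's exact pipeline.

```python
import webdataset as wds
from torch.utils.data import DataLoader

# Each .tar shard packs many pre-tokenized samples; webdataset streams shards
# sequentially, which keeps I/O cheap when many distributed workers read in parallel.
dataset = (
    wds.WebDataset("data/shard-{000000..000099}.tar")  # hypothetical shard names
    .decode()               # decode raw bytes into Python objects
    .to_tuple("txt")        # keep only the text field of each sample
)

# batch_size=None because webdataset yields individual samples;
# batching can be added with dataset.batched(...) or in the training loop.
loader = DataLoader(dataset, batch_size=None, num_workers=2)

for (sample,) in loader:
    pass  # feed the sample to the model's training step
```

Streaming sequential shards rather than random-access files is what keeps data loading inexpensive when many distributed workers read at once.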
Quick Start & Requirements
- Install: `pip install -r requirements.txt`, followed by `pip install --editable .`
- Data preprocessing scripts are included (e.g., `wiki_download.py`, `make_2048.py`).
- Distributed training runs via `torchrun`; an example command is provided in the README (a minimal sketch follows this list).
- Evaluation uses `llm-foundry` (`pip install llm-foundry`).
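The commands below sketch the flow from installation to a distributed launch. The `open_lm.main` entry point and flag names follow the README's example command, but the model name, shard path, and process count are placeholders to verify against the repository.

```sh
# Install dependencies and OpenLM itself (run from the repository root).
pip install -r requirements.txt
pip install --editable .

# Launch multi-GPU training with torchrun; values below are illustrative,
# see the README for the full example command and supported flags.
torchrun --nproc-per-node 4 -m open_lm.main \
  --model open_lm_1b \
  --train-data "data/shard-{000000..000099}.tar" \
  --batch-size 8
```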
Highlighted Details
Maintenance & Community
Developed by researchers from multiple institutions including RAIVN Lab (University of Washington), UWNLP, Toyota Research Institute, and Columbia University. Code is based on open-clip and open-flamingo. Stability.ai provided resource support.
Licensing & Compatibility
The repository does not explicitly state a license in the README; users should seek clarification before commercial use or closed-source linking.
Limitations & Caveats
The README notes that the OpenLM-7B model is still in training, with a checkpoint released at 1.25T tokens. There is also a specific note on positional embedding types for the pretrained OpenLM-1B model, which requires the `head_rotary` setting for compatibility with older training configurations.