PyTorch building blocks for large language model training and inference
OLMo-core provides PyTorch building blocks for the OLMo ecosystem, enabling efficient modeling and training of large language models. It is designed for researchers and developers working with state-of-the-art NLP models, offering optimized components and training scripts for the OLMo-2 family of models.
How It Works
OLMo-core leverages PyTorch for its core operations, integrating specialized kernels and libraries for performance. It supports advanced techniques like flash-attention for context parallelism and fused-linear loss implementations for reduced memory usage. The architecture is designed to facilitate large-scale distributed training, with official scripts provided for reproducible model training and fine-tuning.
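To illustrate the idea behind fused-linear loss implementations mentioned above, here is a minimal sketch of the underlying technique: computing cross-entropy over the vocabulary projection in chunks so the full [tokens, vocab] logits tensor is never materialized at once. This is a generic illustration, not OLMo-core's actual API; the function name and signature are hypothetical.

```python
# Hedged sketch of chunked linear + cross-entropy (the general technique
# behind fused-linear-loss kernels); NOT OLMo-core's real interface.
import torch
import torch.nn.functional as F

def chunked_linear_cross_entropy(hidden, weight, targets, chunk_size=1024):
    """Mean CE loss without materializing the full [N, vocab] logits."""
    total_loss = 0.0
    n = hidden.size(0)
    for start in range(0, n, chunk_size):
        h = hidden[start:start + chunk_size]      # [c, d] slice of hidden states
        logits = h @ weight.t()                   # [c, vocab] exists per chunk only
        total_loss = total_loss + F.cross_entropy(
            logits, targets[start:start + chunk_size], reduction="sum"
        )
    return total_loss / n

torch.manual_seed(0)
hidden = torch.randn(8, 16)                      # 8 tokens, hidden dim 16
weight = torch.randn(32, 16)                     # vocab size 32
targets = torch.randint(0, 32, (8,))

# Matches the naive full-logits computation up to floating-point noise.
naive = F.cross_entropy(hidden @ weight.t(), targets)
chunked = chunked_linear_cross_entropy(hidden, weight, targets, chunk_size=3)
print(torch.allclose(naive, chunked, atol=1e-5))  # → True
```

Production kernels (e.g. those in Liger-Kernel) fuse the projection, loss, and backward pass on the GPU; the chunking above only conveys why the memory savings are possible.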
Quick Start & Requirements
Install with pip install ai2-olmo-core, or pip install -e .[all] for development. The optional dependencies flash-attn, ring-flash-attn, Liger-Kernel, torchao, and grouped_gemm are required for advanced features. Docker images are available.

Highlighted Details
bitsandbytes.

Maintenance & Community
The project is developed by the Allen Institute for AI (AI2). Further details on community channels or a roadmap are not explicitly provided in the README.
Licensing & Compatibility
The repository does not explicitly state a license in the provided README. Compatibility for commercial use or closed-source linking is not specified.
Limitations & Caveats
Some optional dependencies may require compilation from source until a specific PR is merged. Docker images do not include the OLMo-core package itself, requiring manual installation. Compatibility of Docker images may vary based on hardware and CUDA versions. Older OLMo models (7B, 13B) use a different trainer codebase, and their configs/scripts are not compatible with this repository.