OLMo-core by allenai

PyTorch building blocks for large language model training and inference

Created 2 years ago
950 stars

Top 38.5% on SourcePulse

View on GitHub
Project Summary

OLMo-core provides PyTorch building blocks for the OLMo ecosystem, enabling efficient modeling and training of large language models. It is designed for researchers and developers working with state-of-the-art NLP models, offering optimized components and training scripts for the OLMo-2 family of models.

How It Works

OLMo-core leverages PyTorch for its core operations, integrating specialized kernels and libraries for performance. It supports optimized techniques such as flash-attention (with ring-flash-attn for context parallelism) and fused linear loss implementations that reduce memory usage. The architecture is designed to facilitate large-scale distributed training, with official scripts provided for reproducible model training and fine-tuning.
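
The snippet below is a minimal sketch of the kind of fused attention kernel these building blocks rely on; it calls the optional flash-attn dependency directly and is not OLMo-core's own API (tensor shapes and sizes are illustrative).

    import torch
    from flash_attn import flash_attn_func  # optional dependency; requires a CUDA GPU

    # (batch, seq_len, n_heads, head_dim) tensors in half precision on the GPU.
    q = torch.randn(1, 2048, 32, 128, dtype=torch.bfloat16, device="cuda")
    k = torch.randn(1, 2048, 32, 128, dtype=torch.bfloat16, device="cuda")
    v = torch.randn(1, 2048, 32, 128, dtype=torch.bfloat16, device="cuda")

    # Causal attention computed in one fused kernel, avoiding the full
    # seq_len x seq_len attention matrix in memory.
    out = flash_attn_func(q, k, v, causal=True)
    print(out.shape)  # torch.Size([1, 2048, 32, 128])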

Quick Start & Requirements

  • Installation: pip install ai2-olmo-core, or clone the repository and run pip install -e .[all] for a development install (a quick environment check is sketched after this list).
  • Prerequisites: PyTorch (installed per your OS and hardware); optional dependencies such as flash-attn, ring-flash-attn, Liger-Kernel, torchao, and grouped_gemm enable advanced features. Docker images are available.
  • Resources: Training scripts are designed for multi-GPU setups (e.g., H100 clusters). Inference can be performed on single GPUs.
  • Docs: https://github.com/allenai/OLMo-core
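
A quick post-install sanity check (a sketch; the distribution name ai2-olmo-core comes from the pip command above):

    from importlib.metadata import version

    import torch

    print("ai2-olmo-core:", version("ai2-olmo-core"))
    print("PyTorch:", torch.__version__, "| CUDA available:", torch.cuda.is_available())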

Highlighted Details

  • Supports OLMo-2 32B model training and inference.
  • Includes official training scripts for two-stage pretraining procedures.
  • Offers Hugging Face integration for seamless model loading and generation (see the sketch after this list).
  • Provides quantization support via bitsandbytes.
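
Below is a hedged sketch of what loading and generating through the Hugging Face integration can look like. The checkpoint ID allenai/OLMo-2-0325-32B and the 4-bit bitsandbytes configuration are illustrative assumptions, not the repository's documented example; only standard transformers/bitsandbytes APIs are used.

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

    model_id = "allenai/OLMo-2-0325-32B"  # assumed Hugging Face checkpoint name
    quant = BitsAndBytesConfig(           # optional 4-bit quantization via bitsandbytes
        load_in_4bit=True,
        bnb_4bit_compute_dtype=torch.bfloat16,
    )

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        quantization_config=quant,  # helps fit a 32B model on a single GPU
        device_map="auto",          # requires the accelerate package
    )

    inputs = tokenizer("Language modeling is ", return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=64, do_sample=False)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))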

Maintenance & Community

The project is developed by Allen Institute for AI (AI2). Further details on community channels or roadmaps are not explicitly provided in the README.

Licensing & Compatibility

The repository does not explicitly state a license in the provided README. Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

Some optional dependencies may require compilation from source until a specific PR is merged. Docker images do not include the OLMo-core package itself, requiring manual installation. Compatibility of Docker images may vary based on hardware and CUDA versions. Older OLMo models (7B, 13B) use a different trainer codebase, and their configs/scripts are not compatible with this repository.

Health Check

  • Last Commit: 23 hours ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 30
  • Issues (30d): 7
  • Star History: 163 stars in the last 30 days

Explore Similar Projects

Starred by Wing Lian (Founder of Axolotl AI) and Stas Bekman (Author of "Machine Learning Engineering Open Book"; Research Engineer at Snowflake).

fms-fsdp by foundation-model-stack

0.7% · 282
Efficiently train foundation models with PyTorch
Created 2 years ago · Updated 3 months ago
Starred by Théophile Gervet (Cofounder of Genesis AI), Jason Knight (Director of AI Compilers at NVIDIA; Cofounder of OctoML), and 7 more.

lingua by facebookresearch

0.1% · 5k
LLM research codebase for training and inference
Created 1 year ago · Updated 7 months ago