OLMo-core by allenai

PyTorch building blocks for large language model training and inference

Created 2 years ago
950 stars

Top 38.5% on SourcePulse

View on GitHub
Project Summary

OLMo-core provides PyTorch building blocks for the OLMo ecosystem, enabling efficient modeling and training of large language models. It is designed for researchers and developers working with state-of-the-art NLP models, offering optimized components and training scripts for the OLMo-2 family of models.

How It Works

OLMo-core leverages PyTorch for its core operations, integrating specialized kernels and libraries for performance. It supports optimized techniques such as flash-attention (with ring-flash-attn for context parallelism) and fused linear loss implementations that reduce memory usage. The architecture is designed to facilitate large-scale distributed training, with official scripts provided for reproducible model training and fine-tuning.
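
The snippet below is a minimal sketch of the kind of fused attention kernel these building blocks rely on; it calls the optional flash-attn dependency directly and is not OLMo-core's own API (tensor shapes and sizes are illustrative).

    import torch
    from flash_attn import flash_attn_func  # optional dependency; requires a CUDA GPU

    # (batch, seq_len, n_heads, head_dim) tensors in half precision on the GPU.
    q = torch.randn(1, 2048, 32, 128, dtype=torch.bfloat16, device="cuda")
    k = torch.randn(1, 2048, 32, 128, dtype=torch.bfloat16, device="cuda")
    v = torch.randn(1, 2048, 32, 128, dtype=torch.bfloat16, device="cuda")

    # Causal attention computed in one fused kernel, avoiding the full
    # seq_len x seq_len attention matrix in memory.
    out = flash_attn_func(q, k, v, causal=True)
    print(out.shape)  # torch.Size([1, 2048, 32, 128])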

Quick Start & Requirements

  • Installation: pip install ai2-olmo-core, or clone the repository and run pip install -e .[all] for a development install (a quick environment check is sketched after this list).
  • Prerequisites: PyTorch (installed per your OS and hardware); optional dependencies such as flash-attn, ring-flash-attn, Liger-Kernel, torchao, and grouped_gemm enable advanced features. Docker images are available.
  • Resources: Training scripts are designed for multi-GPU setups (e.g., H100 clusters). Inference can be performed on single GPUs.
  • Docs: https://github.com/allenai/OLMo-core
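
A quick post-install sanity check (a sketch; the distribution name ai2-olmo-core comes from the pip command above):

    from importlib.metadata import version

    import torch

    print("ai2-olmo-core:", version("ai2-olmo-core"))
    print("PyTorch:", torch.__version__, "| CUDA available:", torch.cuda.is_available())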

Highlighted Details

  • Supports OLMo-2 32B model training and inference.
  • Includes official training scripts for two-stage pretraining procedures.
  • Offers Hugging Face integration for seamless model loading and generation (see the sketch after this list).
  • Provides quantization support via bitsandbytes.
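
Below is a hedged sketch of what loading and generating through the Hugging Face integration can look like. The checkpoint ID allenai/OLMo-2-0325-32B and the 4-bit bitsandbytes configuration are illustrative assumptions, not the repository's documented example; only standard transformers/bitsandbytes APIs are used.

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

    model_id = "allenai/OLMo-2-0325-32B"  # assumed Hugging Face checkpoint name
    quant = BitsAndBytesConfig(           # optional 4-bit quantization via bitsandbytes
        load_in_4bit=True,
        bnb_4bit_compute_dtype=torch.bfloat16,
    )

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        quantization_config=quant,  # helps fit a 32B model on a single GPU
        device_map="auto",          # requires the accelerate package
    )

    inputs = tokenizer("Language modeling is ", return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=64, do_sample=False)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))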

Maintenance & Community

The project is developed by Allen Institute for AI (AI2). Further details on community channels or roadmaps are not explicitly provided in the README.

Licensing & Compatibility

The repository does not explicitly state a license in the provided README. Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

Some optional dependencies may require compilation from source until a specific PR is merged. Docker images do not include the OLMo-core package itself, requiring manual installation. Compatibility of Docker images may vary based on hardware and CUDA versions. Older OLMo models (7B, 13B) use a different trainer codebase, and their configs/scripts are not compatible with this repository.

Health Check

  • Last Commit: 23 hours ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 30
  • Issues (30d): 7
  • Star History: 163 stars in the last 30 days

Explore Similar Projects

Starred by Wing Lian (Founder of Axolotl AI) and Stas Bekman (Author of "Machine Learning Engineering Open Book"; Research Engineer at Snowflake).

fms-fsdp by foundation-model-stack

0.7% · 282
Efficiently train foundation models with PyTorch
Created 2 years ago · Updated 3 months ago
Starred by Théophile Gervet (Cofounder of Genesis AI), Jason Knight (Director of AI Compilers at NVIDIA; Cofounder of OctoML), and 7 more.

lingua by facebookresearch

0.1% · 5k
LLM research codebase for training and inference
Created 1 year ago · Updated 7 months ago