OLMo-core by allenai

PyTorch building blocks for large language model training and inference

Created 1 year ago
278 stars

Top 93.3% on SourcePulse

Project Summary

OLMo-core provides PyTorch building blocks for the OLMo ecosystem, enabling efficient modeling and training of large language models. It is designed for researchers and developers working with state-of-the-art NLP models, offering optimized components and training scripts for the OLMo-2 family of models.

How It Works

OLMo-core leverages PyTorch for its core operations, integrating specialized kernels and libraries for performance. It supports advanced techniques such as flash-attention (with ring-flash-attn for context parallelism) and fused linear-loss implementations for reduced memory usage. The architecture is designed for large-scale distributed training, with official scripts provided for reproducible model training and fine-tuning.
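To illustrate the memory-saving idea behind fused linear-loss implementations: rather than materializing the full tokens-by-vocabulary logit matrix before computing cross-entropy, the final projection and loss are computed a chunk at a time, so peak memory scales with the chunk size. The pure-Python toy below is a sketch of the pattern only; OLMo-core's actual implementation uses fused GPU kernels (e.g. via Liger-Kernel), and none of the names here are its API.

```python
import math

def chunked_cross_entropy(hidden, weight, targets, chunk=2):
    """Mean cross-entropy of `hidden @ weight.T` against `targets`,
    computed `chunk` rows of logits at a time (toy illustration)."""
    total = 0.0
    for start in range(0, len(hidden), chunk):
        for h, t in zip(hidden[start:start + chunk], targets[start:start + chunk]):
            # Only one row of logits exists at a time, never the full matrix.
            logits = [sum(w_i * h_i for w_i, h_i in zip(row, h)) for row in weight]
            m = max(logits)  # max-subtraction for numerical stability
            logsumexp = m + math.log(sum(math.exp(l - m) for l in logits))
            total += logsumexp - logits[t]
    return total / len(hidden)
```

The chunk size trades peak memory for kernel-launch overhead; the result is identical for any chunk size.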

Quick Start & Requirements

  • Installation: `pip install ai2-olmo-core`, or `pip install -e .[all]` for development.
  • Prerequisites: PyTorch (OS/hardware-specific installation); optional dependencies such as `flash-attn`, `ring-flash-attn`, `Liger-Kernel`, `torchao`, and `grouped_gemm` for advanced features. Docker images are available.
  • Resources: Training scripts are designed for multi-GPU setups (e.g., H100 clusters). Inference can be performed on single GPUs.
  • Docs: https://github.com/allenai/OLMo-core
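Putting the install commands above together, a typical setup might look like the following; the `git clone` step is an assumption inferred from the editable install, not quoted from the README:

```shell
# From PyPI:
pip install ai2-olmo-core

# Or editable, with all optional extras, from a checkout (clone step assumed):
git clone https://github.com/allenai/OLMo-core.git
cd OLMo-core
pip install -e '.[all]'   # quotes keep the brackets away from shell globbing
```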

Highlighted Details

  • Supports OLMo-2 32B model training and inference.
  • Includes official training scripts for two-stage pretraining procedures.
  • Offers Hugging Face integration for seamless model loading and generation.
  • Provides quantization support via bitsandbytes.
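Combining the Hugging Face integration and bitsandbytes support noted above, loading a checkpoint might look like the sketch below. The helper and the model id in the commented usage are illustrative assumptions, not OLMo-core's own API; consult the repository README for released checkpoint names.

```python
def build_load_kwargs(quantize_8bit: bool = False) -> dict:
    """Assemble keyword arguments for AutoModelForCausalLM.from_pretrained
    (hypothetical helper, not part of OLMo-core)."""
    kwargs = {"device_map": "auto"}  # let accelerate place the weights
    if quantize_8bit:
        kwargs["load_in_8bit"] = True  # requires `pip install bitsandbytes`
    return kwargs

# Usage (needs `transformers` and a GPU; commented out so the sketch stays
# self-contained):
# from transformers import AutoModelForCausalLM, AutoTokenizer
# model_id = "allenai/OLMo-2-1124-7B"  # assumed checkpoint name
# tokenizer = AutoTokenizer.from_pretrained(model_id)
# model = AutoModelForCausalLM.from_pretrained(
#     model_id, **build_load_kwargs(quantize_8bit=True)
# )
```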

Maintenance & Community

The project is developed by the Allen Institute for AI (AI2). Further details on community channels or roadmaps are not explicitly provided in the README.

Licensing & Compatibility

The repository does not explicitly state a license in the provided README. Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

Some optional dependencies may require compilation from source until a specific PR is merged. Docker images do not include the OLMo-core package itself, requiring manual installation. Compatibility of Docker images may vary based on hardware and CUDA versions. Older OLMo models (7B, 13B) use a different trainer codebase, and their configs/scripts are not compatible with this repository.

Health Check

  • Last Commit: 5 hours ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 29
  • Issues (30d): 0
  • Star History: 9 stars in the last 30 days

Explore Similar Projects

Starred by Wing Lian (Founder of Axolotl AI) and Stas Bekman (Author of "Machine Learning Engineering Open Book"; Research Engineer at Snowflake).

fms-fsdp by foundation-model-stack

Efficiently train foundation models with PyTorch
0.4% · 262 stars · Created 1 year ago · Updated 1 month ago
Starred by Théophile Gervet (Cofounder of Genesis AI), Jason Knight (Director AI Compilers at NVIDIA; Cofounder of OctoML), and 6 more.

lingua by facebookresearch

LLM research codebase for training and inference
0.1% · 5k stars · Created 10 months ago · Updated 1 month ago