oslo by tunib-ai

Framework for large-scale transformer optimization

Created 4 years ago

309 stars

Top 87.1% on SourcePulse

3 Experts Love This Project

Tri Dao

Chief Scientist at Together AI

Stas Bekman

Author of "Machine Learning Engineering Open Book"; Research Engineer at Snowflake

Philipp Schmid

DevRel at Google DeepMind

Project Summary

OSLO is an open-source framework designed to simplify and accelerate the optimization of large-scale transformer models, primarily for researchers and engineers working with Hugging Face Transformers. It provides GPU-based optimization technologies like 3D parallelism and kernel fusion, making advanced techniques accessible for training models such as GPT-J.

How It Works

OSLO combines several techniques for distributed training and faster execution. Tensor and pipeline parallelism distribute a model's computation across multiple GPUs, and kernel fusion merges several GPU operations into a single kernel, which reduces overhead and increases speed. Data parallelism, the third axis of 3D parallelism, comes from the DeepSpeed integration (ZeRO). OSLO also includes utilities for efficient data processing and model deployment.
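
To make the tensor-parallel idea concrete, here is a minimal sketch of a column-parallel linear layer in pure Python. It is not OSLO's API: the helper names (`matmul`, `split_columns`, `column_parallel_linear`) are hypothetical, the "GPUs" are simulated as list shards, and the final concatenation stands in for the all-gather a real multi-GPU run would perform.

```python
# Conceptual sketch of column-wise tensor parallelism. Not OSLO's API;
# all function names here are hypothetical and devices are simulated.

def matmul(x, w):
    """Multiply matrix x (n x k) by matrix w (k x m), both nested lists."""
    return [
        [sum(row[i] * w[i][j] for i in range(len(w))) for j in range(len(w[0]))]
        for row in x
    ]


def split_columns(w, num_shards):
    """Split w column-wise into equal shards, one per simulated device."""
    cols = len(w[0])
    if cols % num_shards != 0:
        raise ValueError("column count must be divisible by num_shards")
    width = cols // num_shards
    return [
        [row[s * width:(s + 1) * width] for row in w]
        for s in range(num_shards)
    ]


def column_parallel_linear(x, w, num_shards):
    """Each shard computes its slice of the output; slices are concatenated."""
    partials = [matmul(x, shard) for shard in split_columns(w, num_shards)]
    # Concatenate along the feature dimension (an all-gather on real hardware).
    return [sum((p[r] for p in partials), []) for r in range(len(x))]


x = [[1.0, 2.0], [3.0, 4.0]]
w = [[1.0, 0.0, 2.0, 1.0], [0.0, 1.0, 1.0, 3.0]]

# The sharded computation matches the single-device result.
assert column_parallel_linear(x, w, num_shards=2) == matmul(x, w)
```

Each shard only ever holds a fraction of the weight matrix, which is what lets a layer too large for one device be spread across several.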

Quick Start & Requirements

Install via pip: pip install oslo-core
Optional C++ extensions: CPP_AVAILABLE=1 pip install oslo-core (default is 1 on Linux, 0 on Windows).
Requires PyTorch and Hugging Face Transformers.
See USAGE.md for detailed instructions.
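
Before installing, it can help to confirm that the two stated prerequisites are importable. The sketch below uses only the standard library; the helper name `missing_prerequisites` is hypothetical, and the module names `torch` and `transformers` are the usual import names for PyTorch and Hugging Face Transformers.

```python
import importlib.util


def missing_prerequisites(modules=("torch", "transformers")):
    """Return the names of required top-level modules that cannot be found."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


missing = missing_prerequisites()
if missing:
    print("Missing prerequisites:", ", ".join(missing))
else:
    print("PyTorch and Transformers are importable.")
```

This only checks that the packages are present; it does not verify that the installed versions are ones OSLO works with.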

Highlighted Details

Supports 3D Parallelism (tensor and pipeline) for multi-GPU training.
Implements Kernel Fusion for increased training and inference speed.
Offers DeepSpeed support for ZeRO data parallelism.
Includes utilities for data processing and a deployment launcher.
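
The kernel fusion idea can be illustrated without a GPU. The sketch below contrasts an unfused bias-add followed by GELU (two passes and an intermediate buffer) with a fused version (one pass). Which operations OSLO actually fuses is not stated here, so the bias-plus-GELU pairing and the function names are assumptions chosen for illustration; pure Python shows only the data-flow pattern, not real kernel launches.

```python
import math


def gelu(v):
    """Exact GELU activation: 0.5 * v * (1 + erf(v / sqrt(2)))."""
    return 0.5 * v * (1.0 + math.erf(v / math.sqrt(2.0)))


def bias_gelu_unfused(xs, bias):
    """Two passes: the first writes an intermediate buffer, the second reads it."""
    shifted = [x + b for x, b in zip(xs, bias)]  # pass 1: bias add
    return [gelu(v) for v in shifted]            # pass 2: activation


def bias_gelu_fused(xs, bias):
    """One pass: bias add and activation are applied per element together."""
    return [gelu(x + b) for x, b in zip(xs, bias)]


xs = [-1.0, 0.0, 0.5, 2.0]
bias = [0.1, 0.0, -0.5, 1.0]

# Both variants compute the same values; only the number of passes differs.
assert bias_gelu_fused(xs, bias) == bias_gelu_unfused(xs, bias)
```

On a GPU, the unfused form costs an extra kernel launch and an extra round trip through device memory for the intermediate buffer, which is the overhead fusion removes.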

Maintenance & Community

The project was released in December 2021 with version 1.0. The README lists no community channels and shows no signs of active development.

Licensing & Compatibility

Licensed under the Apache License 2.0. This license is permissive and generally compatible with commercial use and closed-source linking.

Limitations & Caveats

The README lists support for the GPT2, GPTNeo, and GPTJ architectures, with more planned, so current model coverage may be limited. The most recent update it mentions dates from December 2021, which raises concerns about whether the project is still maintained and whether it works with newer library versions.

Health Check

Last Commit

3 years ago

Responsiveness

Inactive

Star History

1 star in the last 30 days