oslo by tunib-ai

Framework for large-scale transformer optimization

Created 3 years ago
309 stars

Top 86.9% on SourcePulse

Project Summary

OSLO is an open-source framework designed to simplify and accelerate the optimization of large-scale transformer models, primarily for researchers and engineers working with Hugging Face Transformers. It provides GPU-based optimization technologies like 3D parallelism and kernel fusion, making advanced techniques accessible for training models such as GPT-J.

How It Works

OSLO combines several techniques for distributed training and faster execution. Tensor parallelism splits individual layers across GPUs, and pipeline parallelism splits the model into sequential stages placed on different GPUs; together with data parallelism, these make up the 3D parallelism the project advertises. Kernel fusion merges multiple GPU operations into a single kernel, which reduces launch overhead and memory traffic. OSLO also integrates with DeepSpeed for ZeRO data parallelism and includes utilities for data processing and model deployment.
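To make the tensor-parallel idea concrete, here is a minimal sketch in plain Python. It is not OSLO's implementation or API: the function names are hypothetical, lists stand in for GPU tensors, and list concatenation stands in for the all-gather a real multi-GPU setup would perform. It only shows that splitting a weight matrix by columns and concatenating the partial outputs gives the same result as the unsplit layer.

```python
from typing import List

Matrix = List[List[float]]


def matmul(x: Matrix, w: Matrix) -> Matrix:
    """Plain matrix product: (n x k) @ (k x m) -> (n x m)."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*w)] for row in x]


def split_columns(w: Matrix, num_shards: int) -> List[Matrix]:
    """Split a weight matrix column-wise into equal shards, one per device."""
    cols = len(w[0])
    if cols % num_shards != 0:
        raise ValueError("column count must be divisible by num_shards")
    width = cols // num_shards
    return [[row[i * width:(i + 1) * width] for row in w] for i in range(num_shards)]


def column_parallel_linear(x: Matrix, w: Matrix, num_shards: int) -> Matrix:
    """Each shard computes its own slice of the output columns."""
    partials = [matmul(x, shard) for shard in split_columns(w, num_shards)]
    # Concatenate the per-shard rows; this stands in for an all-gather.
    return [sum((p[r] for p in partials), []) for r in range(len(x))]


x = [[1.0, 2.0], [3.0, 4.0]]
w = [[1.0, 0.0, 2.0, 1.0], [0.0, 1.0, 1.0, 3.0]]

# The sharded computation matches the unsharded one.
assert column_parallel_linear(x, w, num_shards=2) == matmul(x, w)
```

In a real setup each shard would live on a different GPU, so no single device has to hold the full weight matrix.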

Quick Start & Requirements

  • Install via pip: pip install oslo-core
  • Optional C++ extensions: CPP_AVAILABLE=1 pip install oslo-core (default is 1 on Linux, 0 on Windows).
  • Requires PyTorch and Hugging Face Transformers.
  • See USAGE.md for detailed instructions.
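After installing, a quick way to confirm that the prerequisites are visible to your interpreter is a small check like the one below. This is a sketch using only the standard library. It assumes the import names are `torch`, `transformers`, and `oslo`; the last is an assumption about how the `oslo-core` package is imported, so adjust it if your installation differs.

```python
import importlib.util
from typing import Dict, Iterable


def check_prerequisites(
    modules: Iterable[str] = ("torch", "transformers", "oslo"),
) -> Dict[str, bool]:
    """Report which of the listed top-level modules can be located."""
    return {name: importlib.util.find_spec(name) is not None for name in modules}


if __name__ == "__main__":
    for name, found in check_prerequisites().items():
        print(f"{name}: {'found' if found else 'missing'}")
```

The check only locates the modules without importing them, so it does not verify versions or GPU availability.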

Highlighted Details

  • Supports 3D Parallelism (tensor and pipeline) for multi-GPU training.
  • Implements Kernel Fusion for increased training and inference speed.
  • Offers DeepSpeed support for ZeRO data parallelism.
  • Includes utilities for data processing and a deployment launcher.
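The kernel fusion item can be illustrated with a toy example. The sketch below is not taken from OSLO: bias-add followed by GELU is chosen only as a common illustration, and plain Python lists stand in for GPU buffers. The unfused version makes two passes and materializes an intermediate buffer, while the fused version does the same arithmetic in one pass.

```python
import math
from typing import List


def gelu(v: float) -> float:
    """Exact GELU using the Gaussian error function."""
    return 0.5 * v * (1.0 + math.erf(v / math.sqrt(2.0)))


def bias_gelu_unfused(x: List[float], bias: List[float]) -> List[float]:
    """Two passes: the first writes an intermediate buffer, the second reads it."""
    biased = [a + b for a, b in zip(x, bias)]  # pass 1 (separate "kernel")
    return [gelu(v) for v in biased]           # pass 2 (separate "kernel")


def bias_gelu_fused(x: List[float], bias: List[float]) -> List[float]:
    """One pass: add the bias and apply GELU without an intermediate buffer."""
    return [gelu(a + b) for a, b in zip(x, bias)]


x = [-1.0, 0.0, 0.5, 2.0]
bias = [0.1, 0.0, -0.5, 1.0]

# Both versions perform the same arithmetic, so the results are identical.
assert bias_gelu_fused(x, bias) == bias_gelu_unfused(x, bias)
```

Pure Python will not show a meaningful speedup here. On a GPU, the benefit comes from launching one kernel instead of two and from not writing the intermediate result to memory and reading it back.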

Maintenance & Community

The project was released in December 2021 with version 1.0. No specific community channels or active development signals are present in the README.

Licensing & Compatibility

Licensed under the Apache License 2.0. This license is permissive and generally compatible with commercial use and closed-source linking.

Limitations & Caveats

The README lists support for the GPT2, GPTNeo, and GPTJ architectures only, with more planned, so other model families may not work. The last update mentioned in the README is from December 2021, which raises concerns about current maintenance and compatibility with newer library versions.
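Given the short architecture list, it may be worth checking a model's type before attempting to use it. The helper below is hypothetical and not part of OSLO. It assumes the Hugging Face `model_type` strings for the three listed architectures are `gpt2`, `gpt_neo`, and `gptj`, and that you pass in the value from your model's configuration.

```python
# Assumed Hugging Face `model_type` strings for the architectures the README lists.
LISTED_MODEL_TYPES = frozenset({"gpt2", "gpt_neo", "gptj"})


def is_listed_architecture(model_type: str) -> bool:
    """Return True if the model type is one of the three listed architectures."""
    return model_type.strip().lower() in LISTED_MODEL_TYPES


for candidate in ("gptj", "bert"):
    status = "listed" if is_listed_architecture(candidate) else "not listed"
    print(f"{candidate}: {status}")
```

A match only means the architecture appears in the README's list. It does not guarantee compatibility with current versions of PyTorch or Transformers.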

Health Check

  • Last commit: 3 years ago
  • Responsiveness: Inactive
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 0 stars in the last 30 days

Explore Similar Projects

Starred by Amanpreet Singh (Cofounder of Contextual AI) and Ross Taylor (Cofounder of General Reasoning; Cocreator of Papers with Code).

torchshard by kaiyuyue

0%
300
PyTorch engine for tensor slicing into parallel shards
Created 4 years ago
Updated 3 months ago
Starred by Luca Soldaini (Research Scientist at Ai2), Edward Sun (Research Scientist at Meta Superintelligence Lab), and 4 more.

parallelformers by tunib-ai

0%
790
Toolkit for easy model parallelization
Created 4 years ago
Updated 2 years ago
Starred by Tobi Lutke (Cofounder of Shopify), Li Jiang (Coauthor of AutoGen; Engineer at Microsoft), and 26 more.

ColossalAI by hpcaitech

0.1%
41k
AI system for large-scale parallel training
Created 3 years ago
Updated 13 hours ago