onnxruntime-training-examples by microsoft

ORTModule examples for accelerated training of transformer models

Created 5 years ago
345 stars

Top 80.2% on SourcePulse

Project Summary

This repository provides examples for accelerating PyTorch model training using ONNX Runtime (ORT). It targets researchers and engineers working with large transformer models, offering significant speedups and memory optimizations with minimal code changes.

How It Works

ORTModule integrates with PyTorch as a drop-in wrapper: switching to the optimized ORT backend requires changing a single line of code. ONNX Runtime's optimized kernels and memory-management techniques reduce training time and allow larger models to fit on the same hardware. Its extensible execution provider architecture also supports diverse hardware, including NVIDIA and AMD GPUs.

Quick Start & Requirements

  • Install via pip install torch-ort.
  • Requires PyTorch.
  • Supports NVIDIA and AMD GPUs.
  • Refer to official ONNX Runtime documentation for detailed setup: https://www.onnxruntime.ai/
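A minimal install following the steps above might look like this sketch; exact GPU-specific packages vary by environment, so consult the ONNX Runtime documentation linked above:

```shell
# Install PyTorch first (prerequisite), then torch-ort.
pip install torch
pip install torch-ort

# One-time post-install configuration step provided by torch-ort.
python -m torch_ort.configure
```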

Highlighted Details

  • Up to 1.4X training speedup.
  • Enables training of larger models (e.g., GPT-2 on 16GB GPU).
  • Composable with other acceleration libraries like DeepSpeed.
  • Examples cover popular HuggingFace transformer models (BART, BERT, DeBERTa, GPT2, RoBERTa) and T5.

Maintenance & Community

This project is maintained by Microsoft. Contributions are welcome, subject to a Contributor License Agreement (CLA).

Licensing & Compatibility

The repository itself is not explicitly licensed in the README. However, it demonstrates the use of torch-ort, which is part of the ONNX Runtime ecosystem; ONNX Runtime itself is released under the MIT license, which permits commercial use and integration with closed-source projects.

Limitations & Caveats

The examples focus primarily on transformer models and large-scale training scenarios. Performance gains and compatibility may vary for other model architectures or use cases.

Health Check

  • Last Commit: 11 months ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 7 stars in the last 30 days

Explore Similar Projects

Starred by Tri Dao (Chief Scientist at Together AI), Stas Bekman (Author of "Machine Learning Engineering Open Book"; Research Engineer at Snowflake), and 1 more.

oslo by tunib-ai

309 stars
Framework for large-scale transformer optimization
Created 3 years ago
Updated 3 years ago
Starred by Clement Delangue (Cofounder of Hugging Face), Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), and 20 more.

accelerate by huggingface

9k stars
PyTorch training helper for distributed execution
Created 4 years ago
Updated 1 day ago