onnxruntime-training-examples by microsoft

ORTModule examples for accelerated training of transformer models

created 5 years ago · 339 stars · Top 82.4% on sourcepulse

Project Summary

This repository provides examples for accelerating PyTorch model training using ONNX Runtime (ORT). It targets researchers and engineers working with large transformer models, offering significant speedups and memory optimizations with minimal code changes.

How It Works

ORTModule integrates seamlessly with PyTorch, allowing users to switch to an optimized ORT backend with a single line of code. This approach leverages ONNX Runtime's optimized kernels and memory management techniques to achieve faster training times and enable the fitting of larger models onto available hardware. The extensible execution provider architecture also supports diverse hardware, including NVIDIA and AMD GPUs.
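
In practice the swap looks roughly like the following. This is a minimal sketch assuming the torch-ort package's ORTModule wrapper, with a toy feed-forward net standing in for a real transformer:

```python
import torch
from torch_ort import ORTModule  # provided by the torch-ort package

# Any existing torch.nn.Module; a toy net stands in for a transformer here.
model = torch.nn.Sequential(
    torch.nn.Linear(784, 512),
    torch.nn.ReLU(),
    torch.nn.Linear(512, 10),
)

# The single-line change: ONNX Runtime now runs the forward and backward passes.
model = ORTModule(model)

# The rest of the PyTorch training loop is unchanged.
optimizer = torch.optim.SGD(model.parameters(), lr=1e-2)
inputs = torch.randn(32, 784)
targets = torch.randint(0, 10, (32,))
loss = torch.nn.functional.cross_entropy(model(inputs), targets)
loss.backward()
optimizer.step()
```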

Quick Start & Requirements

  • Install via pip install torch-ort; a verification sketch follows this list.
  • Requires PyTorch.
  • Supports NVIDIA and AMD GPUs.
  • Refer to official ONNX Runtime documentation for detailed setup: https://www.onnxruntime.ai/
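
Per the torch-ort documentation, installation is pip install torch-ort followed by a one-time python -m torch_ort.configure to match the local CUDA/ROCm stack. A quick way to verify the setup is to run a single training step; the sketch below uses a HuggingFace checkpoint chosen purely for illustration, and output handling may vary across torch-ort versions:

```python
import torch
from torch_ort import ORTModule
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# "bert-base-uncased" is an illustrative checkpoint, not mandated by the examples.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = ORTModule(AutoModelForSequenceClassification.from_pretrained("bert-base-uncased"))
model.train()

optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)
batch = tokenizer(["ONNX Runtime accelerates training"], return_tensors="pt")
labels = torch.tensor([1])

outputs = model(**batch, labels=labels)  # HF models compute a loss when labels are given
outputs.loss.backward()
optimizer.step()
```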

Highlighted Details

  • Up to 1.4X training speedup.
  • Enables training of larger models (e.g., GPT-2 on 16GB GPU).
  • Composable with other acceleration libraries like DeepSpeed (see the sketch after this list).
  • Examples cover popular HuggingFace transformer models (BART, BERT, DeBERTa, GPT2, RoBERTa) and T5.
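
Composing with DeepSpeed generally means wrapping the model with ORTModule first and then handing it to deepspeed.initialize. A hedged sketch with an illustrative config, not taken from the repo; API details vary by DeepSpeed version:

```python
import deepspeed
import torch
from torch_ort import ORTModule

base_model = torch.nn.Linear(784, 10)  # stand-in for a real transformer
model = ORTModule(base_model)          # ORT optimizes forward/backward

ds_config = {  # illustrative settings only
    "train_micro_batch_size_per_gpu": 8,
    "optimizer": {"type": "AdamW", "params": {"lr": 5e-5}},
    "fp16": {"enabled": True},
}

# DeepSpeed layers its ZeRO/optimizer/mixed-precision machinery on top.
model_engine, optimizer, _, _ = deepspeed.initialize(
    model=model,
    model_parameters=model.parameters(),
    config=ds_config,
)
```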

Maintenance & Community

This project is maintained by Microsoft. Contributions are welcome, subject to a Contributor License Agreement (CLA).

Licensing & Compatibility

The repository itself is not explicitly licensed in the README. However, it demonstrates the use of torch-ort, which is part of the ONNX Runtime ecosystem; ONNX Runtime itself is released under the MIT license, which permits commercial use and integration with closed-source projects.

Limitations & Caveats

The examples focus primarily on transformer models and large-scale training scenarios. Performance gains and compatibility may vary for other model architectures or use cases.

Health Check

  • Last commit: 9 months ago
  • Responsiveness: Inactive
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 8 stars in the last 90 days

Explore Similar Projects

fastformers by microsoft

  • NLU optimization recipes for transformer models
  • 705 stars · created 5 years ago · updated 4 months ago
  • Starred by Patrick von Platen (Core Contributor to Hugging Face Transformers and Diffusers), Lewis Tunstall (Researcher at Hugging Face), and 1 more.

DeepSpeed by deepspeedai

  • Deep learning optimization library for distributed training and inference
  • 40k stars · created 5 years ago · updated 1 day ago
  • Starred by Aravind Srinivas (Cofounder of Perplexity), Stas Bekman (Author of Machine Learning Engineering Open Book; Research Engineer at Snowflake), and 12 more.