onnxruntime-training-examples by microsoft

ORTModule examples for accelerated training of transformer models

Created 5 years ago
345 stars

Top 80.2% on SourcePulse

Project Summary

This repository provides examples for accelerating PyTorch model training using ONNX Runtime (ORT). It targets researchers and engineers working with large transformer models, offering significant speedups and memory optimizations with minimal code changes.

How It Works

ORTModule integrates with PyTorch as a drop-in wrapper: switching to the optimized ORT backend requires changing a single line of code. ONNX Runtime's optimized kernels and memory-management techniques reduce training time and allow larger models to fit on the same hardware. Its extensible execution provider architecture also supports diverse hardware, including NVIDIA and AMD GPUs.

Quick Start & Requirements

  • Install via pip install torch-ort.
  • Requires PyTorch.
  • Supports NVIDIA and AMD GPUs.
  • Refer to official ONNX Runtime documentation for detailed setup: https://www.onnxruntime.ai/
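A minimal install following the steps above might look like this sketch; exact GPU-specific packages vary by environment, so consult the ONNX Runtime documentation linked above:

```shell
# Install PyTorch first (prerequisite), then torch-ort.
pip install torch
pip install torch-ort

# One-time post-install configuration step provided by torch-ort.
python -m torch_ort.configure
```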

Highlighted Details

  • Up to 1.4X training speedup.
  • Enables training of larger models (e.g., GPT-2 on 16GB GPU).
  • Composable with other acceleration libraries like DeepSpeed.
  • Examples cover popular HuggingFace transformer models (BART, BERT, DeBERTa, GPT2, RoBERTa) and T5.

Maintenance & Community

This project is maintained by Microsoft. Contributions are welcome, subject to a Contributor License Agreement (CLA).

Licensing & Compatibility

The repository itself is not explicitly licensed in the README. However, it demonstrates the use of torch-ort, which is part of the ONNX Runtime ecosystem; ONNX Runtime itself is released under the MIT license, which permits commercial use and integration with closed-source projects.

Limitations & Caveats

The examples focus primarily on transformer models and large-scale training scenarios. Performance gains and compatibility may vary for other model architectures or use cases.

Health Check

  • Last Commit: 11 months ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 7 stars in the last 30 days

Explore Similar Projects

Starred by Tri Dao (Chief Scientist at Together AI), Stas Bekman (Author of "Machine Learning Engineering Open Book"; Research Engineer at Snowflake), and 1 more.

oslo by tunib-ai

309 stars
Framework for large-scale transformer optimization
Created 3 years ago
Updated 3 years ago
Starred by Clement Delangue (Cofounder of Hugging Face), Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), and 20 more.

accelerate by huggingface

9k stars
PyTorch training helper for distributed execution
Created 4 years ago
Updated 1 day ago