ort by pytorch

PyTorch extension for ONNX Runtime model acceleration

created 4 years ago
364 stars

Top 78.4% on sourcepulse

View on GitHub
Project Summary

This library accelerates PyTorch model training and inference using ONNX Runtime and Intel® OpenVINO™. It targets PyTorch developers seeking to reduce training time, scale large models, and optimize inference performance, particularly on Intel hardware.

How It Works

The library provides ORTModule for training, which converts PyTorch models to ONNX format and leverages ONNX Runtime for accelerated execution. It also includes optimized optimizers like FusedAdam and FP16_Optimizer, and a LoadBalancingDistributedSampler for efficient data loading in distributed training. For inference, ORTInferenceModule enables ONNX Runtime with the OpenVINO™ Execution Provider, targeting Intel CPUs, GPUs, and VPUs with options for precision (FP32/FP16) and backend.
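
A minimal training sketch, assuming torch-ort is installed and configured. ORTModule is the wrapper named above; the toy model, loss function, and optimizer here are illustrative only.

  import torch
  from torch_ort import ORTModule  # wraps an nn.Module so forward/backward run via ONNX Runtime

  device = "cuda"  # training acceleration currently targets NVIDIA/AMD GPUs (see Limitations)

  # Illustrative toy model; any standard nn.Module can be wrapped.
  model = torch.nn.Sequential(
      torch.nn.Linear(64, 128),
      torch.nn.ReLU(),
      torch.nn.Linear(128, 10),
  ).to(device)
  model = ORTModule(model)

  optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
  loss_fn = torch.nn.CrossEntropyLoss()

  inputs = torch.randn(32, 64, device=device)
  targets = torch.randint(0, 10, (32,), device=device)

  optimizer.zero_grad()
  loss = loss_fn(model(inputs), targets)  # first call triggers ONNX export and graph optimization
  loss.backward()
  optimizer.step()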

Quick Start & Requirements
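A hedged setup sketch; the package name and configure step follow the project's published install instructions, but versions, CUDA/ROCm builds, and the separate inference package should be verified against the current README.

  # Assumed install steps for the training package:
  #   pip install torch-ort
  #   python -m torch_ort.configure
  import torch
  from torch_ort import ORTModule

  model = ORTModule(torch.nn.Linear(8, 2).to("cuda"))   # drop-in wrapper around any nn.Module
  print(model(torch.randn(4, 8, device="cuda")).shape)  # forward pass executes through ONNX Runtime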

Highlighted Details

  • Reduces PyTorch training time and GPU cost for large transformer models.
  • Supports FusedAdam, FP16_Optimizer, and LoadBalancingDistributedSampler for training efficiency.
  • Inference acceleration via OpenVINO™ Execution Provider on Intel hardware (CPU, GPU, VPU); a sketch follows after this list.
  • Offers Mixture of Experts (MoE) layer implementation, usable standalone or with ONNX Runtime.
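
A hedged inference sketch targeting the OpenVINO Execution Provider. ORTInferenceModule and OpenVINOProviderOptions are the names used in the project's inference documentation, but their exact signatures, the backend/precision strings, and the torchvision model chosen here are assumptions.

  import torch
  import torchvision
  from torch_ort import ORTInferenceModule, OpenVINOProviderOptions

  # Assumed options: backend one of "CPU"/"GPU"/"VPU", precision "FP32" or "FP16".
  provider_options = OpenVINOProviderOptions(backend="CPU", precision="FP32")

  model = torchvision.models.resnet50(weights=None).eval()
  model = ORTInferenceModule(model, provider_options=provider_options)

  with torch.no_grad():
      output = model(torch.randn(1, 3, 224, 224))  # executes through the OpenVINO EP
  print(output.shape)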

Maintenance & Community

  • Actively maintained by the PyTorch and ONNX Runtime teams.
  • CI checks for API stability.
  • Contribution guide available.

Licensing & Compatibility

  • MIT License.
  • Compatible with commercial and closed-source applications.

Limitations & Caveats

Training acceleration currently requires NVIDIA or AMD GPUs. The inference package has specific OS and Python version requirements.

Health Check

  • Last commit: 5 months ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 1
  • Star History: 6 stars in the last 90 days

Explore Similar Projects

Starred by Chip Huyen (Author of AI Engineering, Designing Machine Learning Systems), Jeff Hammerbacher (Cofounder of Cloudera), and 2 more.

hummingbird by microsoft

0.0%
3k stars
Compiler for trained ML models into tensor computation
created 5 years ago
updated 2 weeks ago
Starred by Aravind Srinivas (Cofounder of Perplexity), Stas Bekman (Author of Machine Learning Engineering Open Book; Research Engineer at Snowflake), and 12 more.

DeepSpeed by deepspeedai

0.2%
40k stars
Deep learning optimization library for distributed training and inference
created 5 years ago
updated 1 day ago