PyTorch extension for ONNX Runtime model acceleration
This library accelerates PyTorch model training and inference using ONNX Runtime and Intel® OpenVINO™. It targets PyTorch developers seeking to reduce training time, scale large models, and optimize inference performance, particularly on Intel hardware.
How It Works
The library provides `ORTModule` for training, which converts PyTorch models to ONNX format and leverages ONNX Runtime for accelerated execution. It also includes optimized optimizers such as `FusedAdam` and `FP16_Optimizer`, and a `LoadBalancingDistributedSampler` for efficient data loading in distributed training. For inference, `ORTInferenceModule` runs models through ONNX Runtime with the OpenVINO™ Execution Provider, targeting Intel CPUs, GPUs, and VPUs with configurable precision (FP32/FP16) and backend.
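A minimal training sketch using the `ORTModule` wrapper described above; the toy model, dummy batch, and plain `torch.optim.Adam` are illustrative choices, not taken from the project docs:

```python
import torch
from torch_ort import ORTModule

# Any torch.nn.Module can be wrapped; this toy classifier is illustrative.
model = torch.nn.Sequential(
    torch.nn.Linear(128, 256),
    torch.nn.ReLU(),
    torch.nn.Linear(256, 10),
)
model = ORTModule(model)  # forward and backward now execute via ONNX Runtime

optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = torch.nn.CrossEntropyLoss()

inputs = torch.randn(32, 128)           # dummy batch
targets = torch.randint(0, 10, (32,))

optimizer.zero_grad()
loss = loss_fn(model(inputs), targets)  # first call triggers the ONNX export
loss.backward()
optimizer.step()
```

Because the wrapper preserves the `torch.nn.Module` interface, the rest of an existing training loop is unchanged.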
Quick Start & Requirements
- Training: `pip install torch-ort`, then `python -m torch_ort.configure`. Requires an NVIDIA or AMD GPU.
- Inference: `pip install torch-ort-infer[openvino]`. Requires Ubuntu 18.04/20.04 and Python 3.7-3.9.
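A minimal inference sketch, assuming the inference package exposes `ORTInferenceModule` together with an `OpenVINOProviderOptions` helper for the backend/precision options mentioned above; the ResNet-50 workload and the GPU/FP16 selection are illustrative:

```python
import torch
import torchvision
from torch_ort import ORTInferenceModule, OpenVINOProviderOptions

# Pretrained ResNet-50 as an illustrative workload.
model = torchvision.models.resnet50(pretrained=True).eval()

# Default execution targets CPU with FP32; backend and precision
# can be overridden for other Intel devices.
provider_options = OpenVINOProviderOptions(backend="GPU", precision="FP16")
model = ORTInferenceModule(model, provider_options=provider_options)

with torch.no_grad():
    predictions = model(torch.randn(1, 3, 224, 224))  # dummy image batch
```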
Highlighted Details
- `FusedAdam`, `FP16_Optimizer`, and `LoadBalancingDistributedSampler` for training efficiency (a usage sketch follows).
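A sketch of swapping in the fused optimizer, assuming a `torch_ort.optim` import path (not confirmed by this summary) and an Adam-compatible constructor:

```python
import torch
from torch_ort import ORTModule
from torch_ort.optim import FusedAdam  # assumed import path

model = ORTModule(torch.nn.Linear(128, 10))

# Intended as a drop-in replacement for torch.optim.Adam with fused kernels;
# per the requirements above, training runs on a supported GPU.
optimizer = FusedAdam(model.parameters(), lr=1e-3, betas=(0.9, 0.999))

loss = model(torch.randn(32, 128)).sum()
loss.backward()
optimizer.step()
optimizer.zero_grad()
```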
Maintenance & Community
Last commit 5 months ago; the repository is currently marked inactive.
Licensing & Compatibility
Limitations & Caveats
Training acceleration currently requires NVIDIA or AMD GPUs. The inference package is limited to Ubuntu 18.04/20.04 and Python 3.7-3.9.