PyTorch compiler for NVIDIA GPUs using TensorRT
Torch-TensorRT delivers significant inference acceleration for PyTorch models on NVIDIA GPUs. It targets researchers and developers who want to optimize model performance, offering up to 5x latency reduction with minimal code changes.
How It Works
Torch-TensorRT integrates with PyTorch's compilation pipeline (torch.compile) and export workflows. It leverages TensorRT, NVIDIA's SDK for high-performance deep learning inference, to optimize PyTorch models. The compiler converts PyTorch operations into TensorRT engines, fusing layers and applying hardware-specific optimizations for NVIDIA GPUs. This approach allows for seamless integration and substantial performance gains without requiring manual model rewriting.
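A minimal sketch of the ahead-of-time (export) path, following the documented torch_tensorrt.compile API; TinyNet is a placeholder model, and the exact arguments (ir, enabled_precisions) may vary across releases:

```python
import torch
import torch_tensorrt

# Placeholder model for illustration; any eval-mode nn.Module works the same way.
class TinyNet(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = torch.nn.Conv2d(3, 16, 3, padding=1)
        self.relu = torch.nn.ReLU()

    def forward(self, x):
        return self.relu(self.conv(x))

model = TinyNet().eval().cuda()
example_inputs = [torch.randn(1, 3, 224, 224, device="cuda")]

# Ahead-of-time path: trace the model and convert supported operations
# into TensorRT engines.
trt_model = torch_tensorrt.compile(
    model,
    ir="dynamo",                         # use the Dynamo/torch.export frontend
    inputs=example_inputs,
    enabled_precisions={torch.float16},  # allow FP16 kernels where profitable
)

with torch.no_grad():
    out = trt_model(*example_inputs)
```

The returned module is a drop-in replacement for the original, so downstream inference code does not need to change.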
Quick Start & Requirements
Install the stable release with pip install torch-tensorrt, or the nightly build with pip install --pre torch-tensorrt --index-url https://download.pytorch.org/whl/nightly/cu128. Torch-TensorRT is also available in the NVIDIA NGC PyTorch Container.
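As a quick sanity check after installing, something like the following should confirm the package and GPU are visible (assumes a CUDA-enabled PyTorch build on an NVIDIA GPU):

```python
import torch
import torch_tensorrt

print(torch.cuda.is_available())   # expect True on an NVIDIA GPU
print(torch_tensorrt.__version__)  # confirms the package imports cleanly
```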
Highlighted Details

torch.compile backend for one-line optimization (see the sketch below).
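A hedged sketch of that one-line change: the "tensorrt" backend name matches recent documentation, and importing torch_tensorrt is assumed to register it with torch.compile.

```python
import torch
import torch_tensorrt  # noqa: F401 -- importing registers the TensorRT backend

model = torch.nn.Sequential(
    torch.nn.Conv2d(3, 16, 3, padding=1),
    torch.nn.ReLU(),
).eval().cuda()

# The one-line change: route torch.compile through the TensorRT backend.
optimized_model = torch.compile(model, backend="tensorrt")

x = torch.randn(1, 3, 224, 224, device="cuda")
with torch.no_grad():
    out = optimized_model(x)  # engines are built lazily on the first call
```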
Maintenance & Community

The project is actively maintained by NVIDIA. Details on contributing can be found in CONTRIBUTING.md.
Licensing & Compatibility
Licensed under BSD-3-Clause, permitting commercial use and integration with closed-source applications.
Limitations & Caveats
Windows support is limited to the Dynamo backend. Older ARM64 platforms (JetPack < 4.4) require version 1.0.0. Only specific combinations of TensorRT and PyTorch versions are tested; compatibility with other versions is not guaranteed.