TensorRT by pytorch

PyTorch compiler for NVIDIA GPUs using TensorRT

created 5 years ago
2,818 stars

Top 17.3% on sourcepulse

Project Summary

Torch-TensorRT accelerates inference for PyTorch models on NVIDIA GPUs. It targets PyTorch users, researchers, and developers who want to optimize model performance, and it reduces inference latency by up to 5x compared to eager execution with minimal code changes.

How It Works

Torch-TensorRT integrates with PyTorch's compilation pipeline (torch.compile) and export workflows. It uses TensorRT, NVIDIA's SDK for high-performance deep learning inference, to optimize PyTorch models. The compiler converts PyTorch operations into TensorRT engines, fusing layers and applying hardware-specific optimizations for NVIDIA GPUs. Because the conversion happens at compile time, the model code itself does not need to be rewritten by hand.
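
The torch.compile path described above can be sketched roughly as follows. This is a minimal illustration, not the project's reference usage: it assumes torch and torch_tensorrt are installed and a CUDA GPU is present, and `TinyNet`, `tensorrt_ready`, and `run_compiled_demo` are hypothetical names introduced here. When the prerequisites are missing, the sketch skips compilation instead of failing.

```python
import importlib.util


def tensorrt_ready() -> bool:
    """Hypothetical helper: True only if torch, torch_tensorrt and a CUDA GPU are available."""
    if importlib.util.find_spec("torch") is None:
        return False
    if importlib.util.find_spec("torch_tensorrt") is None:
        return False
    import torch

    return torch.cuda.is_available()


def run_compiled_demo():
    """Compile a tiny model with the TensorRT backend.

    Returns the output shape as a tuple, or None if prerequisites are missing.
    """
    if not tensorrt_ready():
        return None

    import torch
    import torch_tensorrt  # noqa: F401  (importing registers the "tensorrt" backend)

    class TinyNet(torch.nn.Module):  # hypothetical model, for illustration only
        def __init__(self):
            super().__init__()
            self.conv = torch.nn.Conv2d(3, 8, kernel_size=3, padding=1)
            self.relu = torch.nn.ReLU()

        def forward(self, x):
            return self.relu(self.conv(x))

    model = TinyNet().eval().cuda()
    x = torch.randn(1, 3, 32, 32, device="cuda")

    # The single line that switches from eager execution to TensorRT.
    optimized = torch.compile(model, backend="tensorrt")

    with torch.no_grad():
        optimized(x)        # first call triggers compilation
        out = optimized(x)  # second call runs the already-compiled model
    return tuple(out.shape)


result = run_compiled_demo()
print("skipped: prerequisites missing" if result is None else f"output shape: {result}")
```

The first call is slow because compilation happens lazily at that point, so latency measurements should exclude it.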

Quick Start & Requirements

  • Install: pip install torch-tensorrt (stable) or pip install --pre torch-tensorrt --index-url https://download.pytorch.org/whl/nightly/cu128 (nightly). Also available in NVIDIA NGC PyTorch Container.
  • Prerequisites: PyTorch (tested with 2.8 nightly), CUDA 12.8, TensorRT 10.9.0.43.
  • Platform Support: Linux AMD64/GPU, Windows/GPU (Dynamo only), Linux aarch64/GPU (JetPack-4.4+). ppc64le/GPU not supported.
  • Docs: https://nvidia.github.io/Torch-TensorRT/
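
Before installing, it can help to see which of the prerequisites are already present. The sketch below uses only the Python standard library and compares installed versions against the ones listed above. The pip distribution names (`torch`, `torch-tensorrt`, `tensorrt`) are assumptions, since some environments ship TensorRT under a different package name, and `installed_versions` and `matches_tested` are hypothetical helpers. The CUDA version is not checked here.

```python
from importlib import metadata

# Versions the list above reports as tested; a reference point, not a hard requirement.
TESTED_PREFIXES = {"torch": "2.8", "tensorrt": "10.9.0.43"}

# Assumed pip distribution names; adjust if your environment uses different ones.
PACKAGES = ("torch", "torch-tensorrt", "tensorrt")


def installed_versions(packages=PACKAGES):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in packages:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


def matches_tested(name, version):
    """True if the package is installed and its version starts with the tested prefix.

    Packages without a listed tested version only need to be installed.
    """
    if version is None:
        return False
    prefix = TESTED_PREFIXES.get(name)
    return prefix is None or version.startswith(prefix)


versions = installed_versions()
for name, version in versions.items():
    if version is None:
        status = "not installed"
    elif matches_tested(name, version):
        status = "matches tested version"
    else:
        status = "untested version"
    print(f"{name}: {version} ({status})")
```

An "untested version" result does not mean the combination will fail, only that it is outside the versions listed as tested.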

Highlighted Details

  • Reduces inference latency by up to 5x compared to eager execution.
  • Supports torch.compile backend for one-line optimization.
  • Provides an export workflow for C++ deployment via TorchScript.
  • Includes tools for resolving graph breaks and boosting performance.
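
The export workflow mentioned in the list can be sketched as follows: compile ahead of time, then save a TorchScript file that a C++ application could load. This is an illustrative sketch under assumptions, not the project's canonical recipe. It assumes torch and torch_tensorrt are installed with a CUDA GPU available; the model, the input shape, and the `export_demo` helper are hypothetical, and the output is written to a temporary directory.

```python
import importlib.util
import os
import tempfile


def export_demo():
    """Compile a small model ahead of time and save it as TorchScript.

    Returns the path of the saved file, or None if prerequisites are missing.
    """
    if importlib.util.find_spec("torch") is None:
        return None
    if importlib.util.find_spec("torch_tensorrt") is None:
        return None

    import torch
    import torch_tensorrt

    if not torch.cuda.is_available():
        return None

    # Hypothetical stand-in model and example input.
    model = torch.nn.Sequential(
        torch.nn.Linear(16, 32),
        torch.nn.ReLU(),
    ).eval().cuda()
    inputs = [torch.randn(4, 16, device="cuda")]

    # Ahead-of-time compilation through the Dynamo frontend.
    trt_gm = torch_tensorrt.compile(model, ir="dynamo", inputs=inputs)

    # Save in TorchScript format, the format used for C++ deployment.
    path = os.path.join(tempfile.mkdtemp(), "trt.ts")
    torch_tensorrt.save(trt_gm, path, output_format="torchscript", inputs=inputs)
    return path


saved_path = export_demo()
print("skipped: prerequisites missing" if saved_path is None else f"saved to {saved_path}")
```

Unlike the torch.compile path, compilation here happens once up front, so the saved artifact can be loaded later without recompiling.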

Maintenance & Community

The project is actively maintained by NVIDIA. Details on contributing can be found in CONTRIBUTING.md.

Licensing & Compatibility

Licensed under BSD-3-Clause, permitting commercial use and integration with closed-source applications.

Limitations & Caveats

Windows support is limited to the Dynamo backend. Older ARM64 platforms (JetPack < 4.4) require version 1.0.0. Only specific TensorRT and PyTorch versions are tested; compatibility with other versions is not guaranteed.

Health Check

  • Last commit: 1 day ago
  • Responsiveness: 1 week
  • Pull requests (30d): 60
  • Issues (30d): 34
  • Star history: 86 stars in the last 90 days
