TensorRT by pytorch

PyTorch compiler for NVIDIA GPUs using TensorRT

Created 5 years ago
2,859 stars

Top 16.7% on SourcePulse

View on GitHub
Project Summary

Torch-TensorRT enables significant inference acceleration for PyTorch models on NVIDIA GPUs. It targets PyTorch users, researchers, and developers seeking to optimize model performance, offering up to 5x latency reduction with minimal code changes.

How It Works

Torch-TensorRT integrates with PyTorch's compilation pipeline (torch.compile) and export workflows. It leverages TensorRT, NVIDIA's SDK for high-performance deep learning inference, to optimize PyTorch models. The compiler converts PyTorch operations into TensorRT engines, fusing layers and applying hardware-specific optimizations for NVIDIA GPUs. This approach allows for seamless integration and substantial performance gains without requiring manual model rewriting.
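The torch.compile integration described above can be sketched in a few lines. This is a hedged sketch, not the project's canonical example: it assumes the `torch_tensorrt` package is installed and a CUDA GPU is available, and guards both so the snippet still runs (unoptimized) without them. The `backend="tensorrt"` name matches the project's documented usage.

```python
import torch

# A tiny stand-in model; any nn.Module works the same way.
model = torch.nn.Sequential(
    torch.nn.Conv2d(3, 16, 3, padding=1),
    torch.nn.ReLU(),
).eval()

x = torch.randn(1, 3, 224, 224)

try:
    import torch_tensorrt  # importing registers the "tensorrt" backend
    if torch.cuda.is_available():
        model, x = model.cuda(), x.cuda()
        # One-line optimization: subgraphs TensorRT supports run as fused
        # TensorRT engines; unsupported ops fall back to PyTorch eager.
        model = torch.compile(model, backend="tensorrt")
except ImportError:
    pass  # torch_tensorrt not installed; run the model unoptimized

with torch.no_grad():
    y = model(x)
print(y.shape)  # torch.Size([1, 16, 224, 224])
```

Because unsupported operations fall back to eager execution, the compiled module remains a drop-in replacement for the original.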

Quick Start & Requirements

  • Install: pip install torch-tensorrt (stable) or pip install --pre torch-tensorrt --index-url https://download.pytorch.org/whl/nightly/cu128 (nightly). Also available in NVIDIA NGC PyTorch Container.
  • Prerequisites: PyTorch (tested with 2.8 nightly), CUDA 12.8, TensorRT 10.9.0.43.
  • Platform Support: Linux AMD64/GPU, Windows/GPU (Dynamo only), Linux aarch64/GPU (JetPack-4.4+). ppc64le/GPU not supported.
  • Docs: https://nvidia.github.io/Torch-TensorRT/

Highlighted Details

  • Achieves up to 5x lower inference latency compared to eager execution.
  • Supports torch.compile backend for one-line optimization.
  • Provides an export workflow for C++ deployment via TorchScript.
  • Includes tools for resolving graph breaks and boosting performance.
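The export workflow for C++ deployment can be sketched as follows. This is a hedged sketch assuming `torch_tensorrt` and a CUDA GPU; the `torch_tensorrt.compile(..., ir="dynamo")` and `torch_tensorrt.save(..., output_format="torchscript")` calls reflect recent Torch-TensorRT releases, and the snippet falls back to plain TorchScript tracing so it runs anywhere.

```python
import torch

model = torch.nn.Linear(4, 2).eval()
example = torch.randn(1, 4)

compiled = False
try:
    import torch_tensorrt
    if torch.cuda.is_available():
        model, example = model.cuda(), example.cuda()
        # Ahead-of-time compilation through the Dynamo/torch.export path.
        trt_mod = torch_tensorrt.compile(model, ir="dynamo", inputs=[example])
        # Serialize as TorchScript so the optimized module can be loaded
        # from C++ via torch::jit::load, with no Python dependency.
        torch_tensorrt.save(trt_mod, "trt_model.ts",
                            output_format="torchscript", inputs=[example])
        compiled = True
except ImportError:
    pass

if not compiled:
    # Fallback: ordinary TorchScript tracing, same .ts artifact format.
    torch.jit.trace(model, example).save("trt_model.ts")

loaded = torch.jit.load("trt_model.ts")
print(loaded(example).shape)  # torch.Size([1, 2])
```

The resulting `.ts` file is self-contained, so a C++ application can deploy it without a Python runtime.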

Maintenance & Community

The project is actively maintained by NVIDIA. Details on contributing can be found in CONTRIBUTING.md.

Licensing & Compatibility

Licensed under BSD-3-Clause, permitting commercial use and integration with closed-source applications.

Limitations & Caveats

Windows support is limited to the Dynamo backend. Older ARM64 platforms (JetPack < 4.4) require version 1.0.0. Support for specific TensorRT and PyTorch versions is tested, and compatibility with other versions is not guaranteed.

Health Check

  • Last Commit: 17 hours ago
  • Responsiveness: 1 week
  • Pull Requests (30d): 42
  • Issues (30d): 21
  • Star History: 25 stars in the last 30 days

Explore Similar Projects

Starred by Luis Capelo (Cofounder of Lightning AI), Alex Yu (Research Scientist at OpenAI; Former Cofounder of Luma AI), and 7 more.

TransformerEngine by NVIDIA

0.4%
3k
Library for Transformer model acceleration on NVIDIA GPUs
Created 3 years ago
Updated 19 hours ago
Starred by Nat Friedman (Former CEO of GitHub), Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), and 15 more.

FasterTransformer by NVIDIA

0.1%
6k
Optimized transformer library for inference
Created 4 years ago
Updated 1 year ago
Starred by Jeff Hammerbacher (Cofounder of Cloudera), Yineng Zhang (Inference Lead at SGLang; Research Scientist at Together AI), and 20 more.

TensorRT-LLM by NVIDIA

0.5%
12k
LLM inference optimization SDK for NVIDIA GPUs
Created 2 years ago
Updated 12 hours ago
Starred by Peter Norvig (Author of "Artificial Intelligence: A Modern Approach"; Research Director at Google), Alexey Milovidov (Cofounder of Clickhouse), and 29 more.

llm.c by karpathy

0.2%
28k
LLM training in pure C/CUDA, no PyTorch needed
Created 1 year ago
Updated 2 months ago