TensorRT by NVIDIA

SDK for accelerated deep learning inference on NVIDIA GPUs

Created 7 years ago
13,057 stars

Top 4.0% on SourcePulse

Project Summary

NVIDIA TensorRT is an SDK for high-performance deep learning inference on NVIDIA GPUs, providing optimized runtimes and tools. This repository hosts the open-source components of TensorRT, including plugins and parsers, which developers can use to accelerate AI inference workflows. The latest release, TensorRT 11.0, streamlines the API: it removes legacy features (weakly-typed networks, implicit quantization, and IPluginV2) in favor of strongly typed networks and explicit quantization.

How It Works

TensorRT optimizes deep learning models for inference by applying techniques such as layer and tensor fusion, kernel auto-tuning, and reduced-precision execution. The open-source components handle model ingestion via parsers (e.g., ONNX) and support custom operations through a plugin system. The optimizations target NVIDIA hardware to maximize throughput and minimize latency for deployed AI models.
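
To make "layer fusion" concrete, here is a minimal pure-Python sketch of one well-known kind of fusion: folding a batch-normalization step into the linear layer that precedes it, so two operations become one. It illustrates the general idea only and is not TensorRT's implementation; the function names and sample values are hypothetical.

```python
import math


def linear(x, weights, bias):
    """Compute y[j] = sum_i(weights[j][i] * x[i]) + bias[j]."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def batch_norm(y, gamma, beta, mean, var, eps=1e-5):
    """Apply inference-time batch normalization per output channel."""
    return [g * (v - m) / math.sqrt(s + eps) + bt
            for v, g, bt, m, s in zip(y, gamma, beta, mean, var)]


def fuse_linear_bn(weights, bias, gamma, beta, mean, var, eps=1e-5):
    """Fold the batch-norm parameters into the linear layer's weights and bias."""
    fused_w, fused_b = [], []
    for row, b, g, bt, m, s in zip(weights, bias, gamma, beta, mean, var):
        scale = g / math.sqrt(s + eps)
        fused_w.append([w * scale for w in row])
        fused_b.append((b - m) * scale + bt)
    return fused_w, fused_b


# Hypothetical sample layer: 2 inputs, 2 outputs.
x = [1.0, 2.0]
weights = [[0.5, -1.0], [2.0, 0.25]]
bias = [0.1, -0.2]
gamma, beta = [1.5, 0.8], [0.0, 0.3]
mean, var = [0.2, -0.1], [4.0, 0.25]

# Two separate layers versus one fused layer.
unfused = batch_norm(linear(x, weights, bias), gamma, beta, mean, var)
fused_w, fused_b = fuse_linear_bn(weights, bias, gamma, beta, mean, var)
fused = linear(x, fused_w, fused_b)
```

The fused layer produces the same outputs (up to floating-point rounding) while doing a single pass over the data, which is the kind of saving fusion aims for.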

Quick Start & Requirements

  • Primary Install: pip install tensorrt
  • Build Prerequisites:
    • TensorRT GA build (v11.0.0.114 recommended)
    • CUDA (versions 13.2.0 or 12.9.0 recommended)
    • cuDNN (optional, v8.9 recommended)
    • GNU make (>= v4.1)
    • cmake (>= v3.31)
    • Python (>= v3.10, <= v3.13.x)
    • pip (>= v19.0)
    • git, pkg-config, wget
    • Optional: NCCL (>= v2.19, < v3.0) for multi-device support.
    • Containerized builds require Docker (>= 19.03) and NVIDIA Container Toolkit.
  • Links:
    • Import Workflows Guide: [See README]
    • Supported Models: [See README]
    • Contribution Guide: [See README]
    • Changelog: [See README]
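
Before building, it can help to check a local environment against the minimums listed above. The following is a small sketch under stated assumptions: the helper names are hypothetical, the version bounds are copied from the prerequisites list, and `tensorrt.__version__` is read only if the package is importable.

```python
import importlib.util
import sys

# Bounds taken from the prerequisites list (Python >= 3.10, <= 3.13.x).
MIN_PYTHON = (3, 10)
MAX_PYTHON = (3, 13)


def python_supported(version=None):
    """Return True if (major, minor) falls inside the listed Python range."""
    major_minor = tuple((version or sys.version_info)[:2])
    return MIN_PYTHON <= major_minor <= MAX_PYTHON


def parse_version(text):
    """Turn a string such as 'v3.31.2' into a tuple of ints: (3, 31, 2)."""
    return tuple(int(part) for part in text.lstrip("v").split(".") if part.isdigit())


def meets_minimum(found, minimum):
    """Compare two dotted version strings, padding the shorter one with zeros."""
    a, b = parse_version(found), parse_version(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def tensorrt_version():
    """Return the installed tensorrt package version, or None if it is absent."""
    if importlib.util.find_spec("tensorrt") is None:
        return None
    import tensorrt
    return tensorrt.__version__
```

For example, `meets_minimum("3.31.2", "3.31")` checks a cmake version string against the listed minimum, and `python_supported()` checks the running interpreter.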

Highlighted Details

  • TensorRT 11.0 released with new capabilities for AI inference acceleration.
  • API streamlined with removal of weakly-typed networks, implicit quantization, and IPluginV2.
  • Introduces Strongly Typed Networks and Explicit Quantization for improved control.
  • Supports various import paths including ONNX, Torch-TensorRT, and HuggingFace/Optimum.
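
As a rough illustration of the ONNX import path, the sketch below parses an ONNX file and serializes an engine. It assumes the Python API names documented for TensorRT 10.x (`trt.Builder`, `trt.OnnxParser`, `build_serialized_network`); whether each name is unchanged in 11.0 should be checked against the release notes. The function name and file paths are hypothetical, and the `STRONGLY_TYPED` flag is looked up defensively in case it is no longer needed.

```python
def build_engine_from_onnx(onnx_path, engine_path):
    """Parse an ONNX model and write a serialized TensorRT engine to disk."""
    # Imported lazily so this module still loads where TensorRT is absent.
    import tensorrt as trt

    logger = trt.Logger(trt.Logger.WARNING)
    builder = trt.Builder(logger)

    # Assumption: request a strongly typed network when the flag exists;
    # fall back to default flags if this release no longer exposes it.
    flag = getattr(trt.NetworkDefinitionCreationFlag, "STRONGLY_TYPED", None)
    flags = (1 << int(flag)) if flag is not None else 0
    network = builder.create_network(flags)

    # Populate the network definition from the ONNX file.
    parser = trt.OnnxParser(network, logger)
    if not parser.parse_from_file(onnx_path):
        errors = [str(parser.get_error(i)) for i in range(parser.num_errors)]
        raise RuntimeError("ONNX parse failed:\n" + "\n".join(errors))

    # Build with a default configuration and check for failure.
    config = builder.create_builder_config()
    serialized = builder.build_serialized_network(network, config)
    if serialized is None:
        raise RuntimeError("Engine build failed")

    with open(engine_path, "wb") as f:
        f.write(serialized)
    return engine_path
```

A call such as `build_engine_from_onnx("model.onnx", "model.engine")` would need a GPU-enabled environment with TensorRT installed; both file names here are placeholders.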

Maintenance & Community

The project provides a Contribution Guide and Coding Guidelines for code contributions. Updates are detailed in the Changelog. Community engagement is encouraged via TensorRT and Triton community channels. Enterprise support is available through NVIDIA AI Enterprise.

Licensing & Compatibility

The README does not explicitly state the license type for the open-source components. Compatibility for commercial use or closed-source linking would require clarification on the licensing terms.

Limitations & Caveats

TensorRT 11.0 removes support for weakly-typed networks, implicit quantization, and the IPluginV2 APIs, so existing code that relies on them must migrate to strongly typed networks, explicit quantization, and the newer plugin interfaces. Python bindings for Python versions older than 3.9 have been removed, and RPM packages now depend on Python 3.12. The TREX tool has been replaced by Nsight Deep Learning Designer.
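
For readers migrating from implicit quantization, it may help to see what an explicit quantize/dequantize (Q/DQ) pair computes. The sketch below shows symmetric INT8 quantization with a per-tensor scale in plain Python, following the usual round-then-clamp definition. It is an arithmetic illustration only, not TensorRT code, and the function names are hypothetical.

```python
INT8_MIN, INT8_MAX = -128, 127


def scale_from_amax(amax):
    """Derive a symmetric per-tensor scale from the largest absolute value."""
    return amax / INT8_MAX


def quantize(x, scale):
    """Map a float to INT8: round x / scale, then clamp to the INT8 range."""
    q = round(x / scale)  # Python rounds halves to the nearest even integer
    return max(INT8_MIN, min(INT8_MAX, q))


def dequantize(q, scale):
    """Map an INT8 value back to an approximate float."""
    return q * scale


# Round trip with a hypothetical scale of 0.5.
q = quantize(1.3, 0.5)
approx = dequantize(q, 0.5)
```

With explicit quantization the scales live in the model itself, as Q/DQ operations, instead of being chosen by the builder during calibration.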

Health Check

  • Last Commit: 1 week ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 5
  • Issues (30d): 7
  • Star History: 95 stars in the last 30 days
