TensorRT_Tutorial by LitLeo

TensorRT tutorials and resources

Created 8 years ago
1,036 stars

Top 36.2% on SourcePulse

Project Summary

This repository provides Chinese translations of NVIDIA's TensorRT documentation along with tutorials for TensorRT, a high-performance deep learning inference optimizer and runtime. It aims to help users, particularly those new to TensorRT or those who find the official documentation hard to work with, understand and use TensorRT effectively to accelerate deep learning models, with a focus on INT8 quantization and custom plugin development.

How It Works

The project offers translated versions of the TensorRT User Guide and detailed walkthroughs of the official TensorRT samples. It also includes hands-on usage notes and blog posts covering INT8 quantization, FP16 precision, custom layer implementation, and model conversion strategies. The content is structured to guide users from basic TensorRT usage to advanced techniques such as writing custom plugins and tuning performance.
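To make that workflow concrete, here is a minimal, hedged sketch (not code from the tutorials) of building a TensorRT 8.x engine from an ONNX model with FP16 enabled; the model path, workspace size, and output file name are placeholders.

```python
# Minimal sketch (not from the repository): build a TensorRT 8.x engine
# from an ONNX model with FP16 enabled. Paths and sizes are placeholders.
import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
parser = trt.OnnxParser(network, logger)

with open("model.onnx", "rb") as f:           # hypothetical model file
    if not parser.parse(f.read()):
        for i in range(parser.num_errors):
            print(parser.get_error(i))
        raise RuntimeError("ONNX parse failed")

config = builder.create_builder_config()
config.set_memory_pool_limit(trt.MemoryPoolType.WORKSPACE, 1 << 30)  # 1 GiB
if builder.platform_has_fast_fp16:
    config.set_flag(trt.BuilderFlag.FP16)     # enable FP16 kernels

engine_bytes = builder.build_serialized_network(network, config)
with open("model.engine", "wb") as f:
    f.write(engine_bytes)
```

The INT8 path additionally requires a calibrator; a sketch of one is given under Highlighted Details below.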

Quick Start & Requirements

  • Installation: No direct installation command is provided as this is a documentation and tutorial repository. Users will need to install TensorRT separately from NVIDIA's developer site.
  • Prerequisites: An NVIDIA GPU, the CUDA Toolkit, and TensorRT itself; familiarity with C++ and deep learning concepts is beneficial (a quick installation sanity check is sketched after this list).
  • Resources: Links to TensorRT download pages, official documentation, and GTC presentations are provided.
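Since the repository is documentation-only, the only setup to verify is the TensorRT installation itself. A minimal check, assuming the TensorRT Python bindings are installed, might look like this:

```python
# Quick sanity check (assumes the TensorRT Python bindings are installed):
# print the TensorRT version and confirm the platform's precision support.
import tensorrt as trt

print("TensorRT version:", trt.__version__)

# Creating a Builder typically fails if no usable CUDA device/driver is available.
logger = trt.Logger(trt.Logger.ERROR)
builder = trt.Builder(logger)
print("Platform has fast FP16:", builder.platform_has_fast_fp16)
print("Platform has fast INT8:", builder.platform_has_fast_int8)
```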

Highlighted Details

  • Detailed explanations and translations for TensorRT versions up to 8.5.3.
  • Focus on INT8 quantization, FP16 precision, and custom plugin development (an illustrative INT8 calibrator is sketched after this list).
  • Includes practical examples and blog posts on optimizing inference.
  • Covers various model conversion methods and acceleration strategies.
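To illustrate the INT8 side of that focus, below is a hedged sketch of an entropy calibrator in Python; the batch source, tensor layout, cache file name, and use of pycuda for host-to-device copies are assumptions for illustration, not code from the repository.

```python
# Hedged sketch of an INT8 entropy calibrator (illustrative only).
import os
import numpy as np
import pycuda.autoinit           # initializes a CUDA context
import pycuda.driver as cuda
import tensorrt as trt

class EntropyCalibrator(trt.IInt8EntropyCalibrator2):
    def __init__(self, batches, cache_file="calib.cache"):
        super().__init__()
        self.batches = iter(batches)          # iterable of NumPy arrays (NCHW)
        self.cache_file = cache_file
        first = next(self.batches)
        self.batch_size = first.shape[0]
        # Assumes all batches share the same shape and dtype.
        self.device_mem = cuda.mem_alloc(first.nbytes)
        self.current = first

    def get_batch_size(self):
        return self.batch_size

    def get_batch(self, names):
        if self.current is None:
            return None                       # no more calibration data
        cuda.memcpy_htod(self.device_mem, np.ascontiguousarray(self.current))
        self.current = next(self.batches, None)
        return [int(self.device_mem)]

    def read_calibration_cache(self):
        if os.path.exists(self.cache_file):
            with open(self.cache_file, "rb") as f:
                return f.read()
        return None

    def write_calibration_cache(self, cache):
        with open(self.cache_file, "wb") as f:
            f.write(cache)
```

Wiring it in would then amount to calling `config.set_flag(trt.BuilderFlag.INT8)` and setting `config.int8_calibrator = EntropyCalibrator(batches)` on the builder config from the earlier sketch.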

Maintenance & Community

The project was initiated in 2017 and has seen updates, including translations for TensorRT 8.5.3 in late 2023. A QQ group (483063470) is available for community interaction. The repository also mentions recruitment for AI heterogeneous acceleration internships at Tencent Beijing AILAB.

Licensing & Compatibility

The repository itself does not specify a license. The content is primarily for educational and informational purposes, translating and explaining NVIDIA's TensorRT, which is subject to NVIDIA's own licensing terms.

Limitations & Caveats

Some of the older tutorial chapters (1-2) point readers to the latest video versions instead, so the written material there may be out of date. The project is a community effort focused on translation and explanation, not a software package to be installed.

Health Check

  • Last Commit: 1 year ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 3 stars in the last 30 days
Explore Similar Projects

Starred by Andrej Karpathy (Founder of Eureka Labs; formerly at Tesla, OpenAI; author of CS 231n), Georgios Konstantopoulos (CTO, General Partner at Paradigm), and 15 more:

  • ThunderKittens by HazyResearch — CUDA kernel framework for fast deep learning primitives. ~3k stars; created 1 year ago, updated 3 days ago.

Starred by Nat Friedman (former CEO of GitHub), Chip Huyen (author of "AI Engineering" and "Designing Machine Learning Systems"), and 15 more:

  • FasterTransformer by NVIDIA — Optimized transformer library for inference. ~6k stars; created 4 years ago, updated 1 year ago.

Starred by Bojan Tunguz (AI scientist; formerly at NVIDIA), Alex Chen (cofounder of Nexa AI), and 19 more:

  • ggml by ggml-org — Tensor library for machine learning. ~13k stars; created 3 years ago, updated 2 days ago.

Starred by Jeff Hammerbacher (cofounder of Cloudera), Yineng Zhang (Inference Lead at SGLang; Research Scientist at Together AI), and 20 more:

  • TensorRT-LLM by NVIDIA — LLM inference optimization SDK for NVIDIA GPUs. ~12k stars; created 2 years ago, updated 15 hours ago.