tutorials by triton-inference-server

Tutorials for Triton Inference Server deployment

Created 2 years ago
795 stars

Top 44.2% on SourcePulse

Project Summary

This repository provides tutorials and examples for the Triton Inference Server, targeting users migrating from traditional deep learning inference to a more streamlined "Tensor in & Tensor out" approach. It aims to familiarize users with Triton's features and ease their transition to efficient model deployment.
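To make the "Tensor in & Tensor out" idea concrete, here is a minimal sketch (not taken from the tutorials) of the JSON body for an inference request in the KServe v2 protocol, which Triton's HTTP endpoint accepts. The model name `my_model` and the tensor names `INPUT0` and `OUTPUT0` are hypothetical placeholders, and the helper `build_infer_request` is an illustration rather than part of any Triton client library. Nothing is sent over the network.

```python
import json


def build_infer_request(input_name, data, shape, datatype="FP32", output_name=None):
    """Build the JSON body of a KServe v2 inference request.

    Tensor names are placeholders supplied by the caller; a real model
    defines its own names, shapes, and datatypes.
    """
    # The flattened data must contain exactly as many elements as the shape implies.
    expected = 1
    for dim in shape:
        expected *= dim
    if expected != len(data):
        raise ValueError(f"shape {shape} expects {expected} elements, got {len(data)}")

    body = {
        "inputs": [
            {
                "name": input_name,
                "shape": list(shape),
                "datatype": datatype,
                "data": list(data),  # flattened, row-major
            }
        ]
    }
    # Requesting specific outputs is optional in the protocol.
    if output_name is not None:
        body["outputs"] = [{"name": output_name}]
    return body


# Hypothetical model and tensor names, for illustration only.
MODEL_NAME = "my_model"
INFER_PATH = f"/v2/models/{MODEL_NAME}/infer"

request_body = build_infer_request(
    "INPUT0", [1.0, 2.0, 3.0, 4.0], [1, 4], output_name="OUTPUT0"
)
print(INFER_PATH)
print(json.dumps(request_body, indent=2))
```

The client sends only named tensors and receives named tensors back; any pre- or post-processing happens outside this exchange or in a separate model.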

How It Works

The tutorials demonstrate deploying models from various frameworks (PyTorch, TensorFlow, ONNX, TensorRT, vLLM, OpenVINO) to Triton. They cover conceptual understanding of inference infrastructure challenges, framework-specific deployment methods, and feature-specific examples. A dedicated HuggingFace guide details various deployment strategies for HuggingFace models.
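Whatever the source framework, deployment to Triton generally means placing the model in a model repository: one directory per model, a `config.pbtxt` describing its tensors, and a numbered version subdirectory holding the model file. The sketch below lays out such a directory for a hypothetical ONNX model. The model name, tensor names, shapes, and the helpers `render_config` and `create_model_repository` are assumptions made for illustration and do not come from the tutorials; no actual model file is written.

```python
import tempfile
from pathlib import Path


def render_config(name, backend, max_batch_size, inputs, outputs):
    """Render a minimal config.pbtxt as text.

    `inputs` and `outputs` are lists of (tensor_name, data_type, dims) tuples.
    """

    def tensor_block(tensors):
        entries = []
        for tensor_name, data_type, dims in tensors:
            dims_text = ", ".join(str(d) for d in dims)
            entries.append(
                "  {\n"
                f'    name: "{tensor_name}"\n'
                f"    data_type: {data_type}\n"
                f"    dims: [ {dims_text} ]\n"
                "  }"
            )
        return ",\n".join(entries)

    return (
        f'name: "{name}"\n'
        f'backend: "{backend}"\n'
        f"max_batch_size: {max_batch_size}\n"
        f"input [\n{tensor_block(inputs)}\n]\n"
        f"output [\n{tensor_block(outputs)}\n]\n"
    )


def create_model_repository(root, name, config_text, version=1):
    """Create <root>/<name>/config.pbtxt and an empty <root>/<name>/<version>/ directory."""
    model_dir = Path(root) / name
    (model_dir / str(version)).mkdir(parents=True, exist_ok=True)
    (model_dir / "config.pbtxt").write_text(config_text)
    return model_dir


# Hypothetical model definition, for illustration only.
config_text = render_config(
    name="my_onnx_model",
    backend="onnxruntime",
    max_batch_size=8,
    inputs=[("INPUT0", "TYPE_FP32", [4])],
    outputs=[("OUTPUT0", "TYPE_FP32", [2])],
)
repo_root = tempfile.mkdtemp()
model_dir = create_model_repository(repo_root, "my_onnx_model", config_text)
print(config_text)
print(sorted(p.name for p in model_dir.iterdir()))
```

In a real deployment the exported model file (for the ONNX case, typically `model.onnx`) would be copied into the version directory before pointing the server at the repository root.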

Quick Start & Requirements

Highlighted Details

  • Includes tutorials for popular LLMs like Llama-2-7B and Falcon-7B using TensorRT-LLM and HuggingFace Transformers.
  • Covers deployment of models from PyTorch, TensorFlow, ONNX, TensorRT, vLLM, and OpenVINO.
  • Features guides on building inference infrastructure, migrating existing solutions, and agentic workflows.
  • Points to related repositories for Triton Server, Client, Backends, Model Analyzer, and Model Navigator.

Maintenance & Community

  • Contributions are welcomed via pull requests.
  • Requests for new examples can be submitted via issues.

Licensing & Compatibility

  • The README does not explicitly state a license for the repository itself. The Triton Inference Server core is released under the BSD 3-Clause license, but this should be verified with the main Triton repository.

Limitations & Caveats

  • The list of supported LLMs in the tutorials is not exhaustive.
  • Examples assume a basic familiarity with Triton Inference Server; prior review of getting started materials is recommended.

Health Check

  • Last commit: 10 hours ago
  • Responsiveness: Inactive
  • Pull requests (30d): 3
  • Issues (30d): 0
  • Star history: 14 stars in the last 30 days
