tutorials by triton-inference-server

Tutorials for Triton Inference Server deployment

created 2 years ago
747 stars

Top 47.5% on sourcepulse

Project Summary

This repository provides tutorials and examples for the Triton Inference Server, targeting users migrating from traditional deep learning inference to a more streamlined "Tensor in & Tensor out" approach. It aims to familiarize users with Triton's features and ease their transition to efficient model deployment.
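As a rough illustration of the "Tensor in & Tensor out" idea, the sketch below builds the JSON body of an inference request in the KServe v2 style that Triton's HTTP endpoint accepts: a named input tensor with a shape, datatype, and flat data, plus the name of the output tensor to return. The model name `example_model`, the tensor names `INPUT0` and `OUTPUT0`, and the helper `build_infer_request` are hypothetical and do not come from the tutorials. Nothing is sent over the network here.

```python
import json


def build_infer_request(input_name, shape, datatype, data, output_name):
    """Build a KServe v2 style request body: one tensor in, one tensor out."""
    # The flat data list must hold exactly as many values as the shape implies.
    expected = 1
    for dim in shape:
        expected *= dim
    if len(data) != expected:
        raise ValueError(f"shape {shape} needs {expected} values, got {len(data)}")
    return {
        "inputs": [
            {
                "name": input_name,
                "shape": list(shape),
                "datatype": datatype,
                "data": list(data),
            }
        ],
        "outputs": [{"name": output_name}],
    }


# Hypothetical model name; a real deployment defines its own.
MODEL_NAME = "example_model"
INFER_PATH = f"/v2/models/{MODEL_NAME}/infer"

request_body = build_infer_request(
    "INPUT0", [1, 4], "FP32", [0.1, 0.2, 0.3, 0.4], "OUTPUT0"
)
payload = json.dumps(request_body)
```

Against a running server, `payload` would be POSTed to `INFER_PATH`; the response carries an `outputs` list with the same name, shape, datatype, and data layout.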

How It Works

The tutorials demonstrate deploying models from various frameworks (PyTorch, TensorFlow, ONNX, TensorRT, vLLM, OpenVINO) to Triton. They cover conceptual understanding of inference infrastructure challenges, framework-specific deployment methods, and feature-specific examples. A dedicated HuggingFace guide details various deployment strategies for HuggingFace models.
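Whatever the source framework, deployment to Triton generally means placing the model in a model repository with a `config.pbtxt` next to a numbered version directory. The sketch below generates that layout for an ONNX model in a temporary directory. The model name `example_onnx`, the tensor names, the dimensions, and the helper functions are assumptions for illustration and are not taken from the tutorials.

```python
import tempfile
from pathlib import Path


def render_tensors(kind, tensors):
    """Render an `input [...]` or `output [...]` section of config.pbtxt."""
    entries = []
    for tensor in tensors:
        dims = ", ".join(str(d) for d in tensor["dims"])
        entries.append(
            "  {\n"
            f'    name: "{tensor["name"]}"\n'
            f'    data_type: {tensor["data_type"]}\n'
            f"    dims: [ {dims} ]\n"
            "  }"
        )
    return f"{kind} [\n" + ",\n".join(entries) + "\n]"


def render_config(name, backend, max_batch_size, inputs, outputs):
    """Render a minimal model configuration as text."""
    lines = [
        f'name: "{name}"',
        f'backend: "{backend}"',
        f"max_batch_size: {max_batch_size}",
        render_tensors("input", inputs),
        render_tensors("output", outputs),
    ]
    return "\n".join(lines) + "\n"


def create_model_dir(repo_root, config_text, name, version=1):
    """Create <repo_root>/<name>/config.pbtxt and an empty version directory."""
    model_dir = Path(repo_root) / name
    (model_dir / str(version)).mkdir(parents=True)
    (model_dir / "config.pbtxt").write_text(config_text)
    return model_dir


# Hypothetical image classifier: names and dims are placeholders.
config_text = render_config(
    name="example_onnx",
    backend="onnxruntime",
    max_batch_size=8,
    inputs=[{"name": "INPUT0", "data_type": "TYPE_FP32", "dims": [3, 224, 224]}],
    outputs=[{"name": "OUTPUT0", "data_type": "TYPE_FP32", "dims": [1000]}],
)
repo_root = tempfile.mkdtemp(prefix="model_repository_")
model_dir = create_model_dir(repo_root, config_text, "example_onnx")
```

The exported model file would then be copied into the version directory (for the ONNX Runtime backend, as `1/model.onnx`), and the server pointed at `repo_root` as its model repository.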

Quick Start & Requirements

Highlighted Details

  • Includes tutorials for popular LLMs like Llama-2-7B and Falcon-7B using TensorRT-LLM and HuggingFace Transformers.
  • Covers deployment of models from PyTorch, TensorFlow, ONNX, TensorRT, vLLM, and OpenVINO.
  • Features guides on building inference infrastructure, migrating existing solutions, and agentic workflows.
  • Points to related repositories for Triton Server, Client, Backends, Model Analyzer, and Model Navigator.

Maintenance & Community

  • Contributions are welcomed via pull requests.
  • Requests for new examples can be submitted via issues.

Licensing & Compatibility

  • The README does not state a license for the repository itself. The Triton Inference Server core is released under the BSD 3-Clause license; verify the terms against the main Triton repository before reuse.

Limitations & Caveats

  • The list of supported LLMs in the tutorials is not exhaustive.
  • Examples assume a basic familiarity with Triton Inference Server; prior review of getting started materials is recommended.

Health Check

  • Last commit: 15 hours ago
  • Responsiveness: Inactive
  • Pull requests (30d): 2
  • Issues (30d): 0
  • Star history: 60 stars in the last 90 days
