backend by triton-inference-server

Triton backend tools for model execution

Created 5 years ago · 347 stars · Top 80.0% on SourcePulse

View on GitHub
Project Summary

This repository provides common source, scripts, and utilities for developing custom backends for the Triton Inference Server. It targets developers building custom inference logic or integrating new frameworks with Triton, enabling efficient model execution and pre/post-processing.

How It Works

Backends are implemented as shared libraries adhering to the Triton Backend API, which defines interfaces for managing backend, model, and instance lifecycles, as well as handling inference requests and responses. This API allows backends to interact with Triton for request processing, tensor data access, and response generation, supporting both single and decoupled response patterns.
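
As a concrete illustration, a backend shared library exports a small set of C entry points that Triton resolves at load time. The skeleton below is a sketch only, assuming the tritonbackend.h header installed by Triton's core repository; the function bodies are elided, and the repository's example backends show the complete pattern.

    // minimal_backend.cc: skeletal Triton backend (illustrative sketch only).
    // Real backends add error handling, model state, input tensor access,
    // and response construction.
    #include "triton/core/tritonbackend.h"

    extern "C" {

    // Called once when Triton loads the backend shared library.
    TRITONSERVER_Error* TRITONBACKEND_Initialize(TRITONBACKEND_Backend* backend)
    {
      return nullptr;  // nullptr indicates success
    }

    // Called once for each model configured to use this backend.
    TRITONSERVER_Error* TRITONBACKEND_ModelInitialize(TRITONBACKEND_Model* model)
    {
      return nullptr;
    }

    // Called once per model instance (e.g., per CPU/GPU copy of the model).
    TRITONSERVER_Error* TRITONBACKEND_ModelInstanceInitialize(
        TRITONBACKEND_ModelInstance* instance)
    {
      return nullptr;
    }

    // The core hook: Triton delivers a batch of inference requests here.
    // A real backend reads each request's input tensors, runs the model,
    // and sends one (or, for decoupled models, several) responses.
    TRITONSERVER_Error* TRITONBACKEND_ModelInstanceExecute(
        TRITONBACKEND_ModelInstance* instance, TRITONBACKEND_Request** requests,
        const uint32_t request_count)
    {
      for (uint32_t r = 0; r < request_count; ++r) {
        // ... build and send a TRITONBACKEND_Response for requests[r] ...
        TRITONBACKEND_RequestRelease(
            requests[r], TRITONSERVER_REQUEST_RELEASE_ALL);
      }
      return nullptr;
    }

    }  // extern "C"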

Quick Start & Requirements

  • Build: mkdir build && cd build && cmake -DCMAKE_INSTALL_PREFIX:PATH=$(pwd)/install .. && make install
  • Dependencies: Requires Triton's common and core repositories. Specific tags can be set via CMake arguments (e.g., -DTRITON_COMMON_REPO_TAG=[tag]).
  • Integration: The utilities are typically pulled into a backend's own build via its CMakeLists.txt rather than by building this repository directly; a sketch of that pattern follows this list.
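
The CMake fragment below sketches that integration, following the FetchContent pattern used by the official example backends; the triton-backend-utils and triton-core-serverapi target names are those examples' conventions, and the backend target and source file are hypothetical placeholders.

    # Sketch: pull the common, core, and backend repos into a backend build.
    # Tags normally come from -DTRITON_*_REPO_TAG cache variables.
    include(FetchContent)
    FetchContent_Declare(
      repo-common
      GIT_REPOSITORY https://github.com/triton-inference-server/common.git
      GIT_TAG ${TRITON_COMMON_REPO_TAG})
    FetchContent_Declare(
      repo-core
      GIT_REPOSITORY https://github.com/triton-inference-server/core.git
      GIT_TAG ${TRITON_CORE_REPO_TAG})
    FetchContent_Declare(
      repo-backend
      GIT_REPOSITORY https://github.com/triton-inference-server/backend.git
      GIT_TAG ${TRITON_BACKEND_REPO_TAG})
    FetchContent_MakeAvailable(repo-common repo-core repo-backend)

    # Hypothetical backend target; the name and sources are placeholders.
    add_library(triton-my-backend SHARED src/my_backend.cc)
    target_link_libraries(triton-my-backend PRIVATE
      triton-core-serverapi   # Triton core API
      triton-backend-utils)   # utilities from this repository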

Highlighted Details

  • The Backend API underlies Triton's official backends, including TensorRT, ONNX Runtime, TensorFlow, PyTorch, OpenVINO, DALI, FIL, TensorRT-LLM, and vLLM.
  • Provides a Python backend option for custom Python-based pre/post-processing or direct execution of Python scripts.
  • The Triton Backend API is C-based, offering fine-grained control over model execution and request handling.
  • Supports decoupled responses, allowing a model to return multiple responses per request, potentially out of order; see the sketch after this list.
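
To make the decoupled pattern concrete, here is a sketch of a helper (the name SendDecoupledResponses is hypothetical) that emits several responses for one request via a response factory. The TRITONBACKEND_ResponseFactory* calls and the TRITONSERVER_RESPONSE_COMPLETE_FINAL flag come from the Backend API header; error handling is omitted for brevity.

    #include "triton/core/tritonbackend.h"

    // Sketch: send 'n' responses for a single request in a decoupled model,
    // then close the response stream. Would be called from within
    // TRITONBACKEND_ModelInstanceExecute; errors are ignored for brevity.
    static void
    SendDecoupledResponses(TRITONBACKEND_Request* request, int n)
    {
      TRITONBACKEND_ResponseFactory* factory = nullptr;
      TRITONBACKEND_ResponseFactoryNew(&factory, request);

      for (int i = 0; i < n; ++i) {
        TRITONBACKEND_Response* response = nullptr;
        TRITONBACKEND_ResponseNewFromFactory(&response, factory);
        // ... attach output tensors to 'response' here ...
        TRITONBACKEND_ResponseSend(
            response, 0 /* more responses follow */, nullptr /* success */);
      }

      // No further responses for this request; close the stream.
      TRITONBACKEND_ResponseFactorySendFlags(
          factory, TRITONSERVER_RESPONSE_COMPLETE_FINAL);
      TRITONBACKEND_ResponseFactoryDelete(factory);
    }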

Maintenance & Community

  • This repository is part of the Triton Inference Server project. General questions can be directed to the main Triton issues page.

Licensing & Compatibility

  • The repository's license is not explicitly stated in the README. However, the Triton Inference Server project is distributed under the permissive BSD 3-Clause license, suggesting compatibility with commercial and closed-source applications.

Limitations & Caveats

  • Backends developed with the "legacy custom backend" API are deprecated and must be ported to the new Triton Backend API.
  • Platform support varies across the different official backends; a "Backend-Platform Support Matrix" should be consulted.

Health Check

  • Last Commit: 1 week ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0

Star History

  • 8 stars in the last 30 days

Explore Similar Projects

Starred by Jiayi Pan (Author of SWE-Gym; MTS at xAI), Christian Laforte (Distinguished Engineer at NVIDIA; Former CTO at Stability AI), and 3 more.

lightning-hydra-template by ashleve
Top 0.1% · 5k stars
ML experimentation template using PyTorch Lightning + Hydra
Created 4 years ago · Updated 1 year ago
Starred by Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), Travis Fischer (Founder of Agentic), and 2 more.

modelscope by modelscope
Top 0.2% · 8k stars
Model-as-a-Service library for model inference, training, and evaluation
Created 3 years ago · Updated 1 day ago
Starred by Peter Norvig (Author of "Artificial Intelligence: A Modern Approach"; Research Director at Google), Aravind Srinivas (Cofounder of Perplexity), and 98 more.

tensorflow by tensorflow
Top 0.1% · 192k stars
Open-source ML framework
Created 10 years ago · Updated 15 hours ago