backend by triton-inference-server

Triton backend tools for model execution

created 5 years ago
336 stars

Top 83.0% on sourcepulse

Project Summary

This repository provides common source, scripts, and utilities for developing custom backends for the Triton Inference Server. It targets developers building custom inference logic or integrating new frameworks with Triton, enabling efficient model execution and pre/post-processing.

How It Works

Backends are implemented as shared libraries adhering to the Triton Backend API, which defines interfaces for managing backend, model, and instance lifecycles, as well as handling inference requests and responses. This API allows backends to interact with Triton for request processing, tensor data access, and response generation, supporting both single and decoupled response patterns.
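
In sketch form, a backend reduces to a small set of C entry points exported from the shared library: optional lifecycle hooks for the backend, model, and instance, plus the execute function that receives batches of requests. The outline below is a minimal illustration only, assuming the tritonbackend.h header from Triton's core repository; error checking is elided, and a real backend reads input tensors and populates outputs where the comment indicates:

    // minimal_backend.cc -- illustrative skeleton, not a complete backend
    #include "triton/core/tritonbackend.h"

    extern "C" {

    // Lifecycle hooks: called when the backend library is loaded, when a
    // model using it is loaded, and when an execution instance is created.
    // Returning nullptr indicates success.
    TRITONSERVER_Error* TRITONBACKEND_Initialize(TRITONBACKEND_Backend* backend)
    {
      return nullptr;
    }

    TRITONSERVER_Error* TRITONBACKEND_ModelInitialize(TRITONBACKEND_Model* model)
    {
      return nullptr;
    }

    TRITONSERVER_Error* TRITONBACKEND_ModelInstanceInitialize(
        TRITONBACKEND_ModelInstance* instance)
    {
      return nullptr;
    }

    // The required entry point: Triton hands the backend a batch of requests;
    // the backend must send a response for each and then release the request.
    TRITONSERVER_Error* TRITONBACKEND_ModelInstanceExecute(
        TRITONBACKEND_ModelInstance* instance, TRITONBACKEND_Request** requests,
        const uint32_t request_count)
    {
      for (uint32_t r = 0; r < request_count; ++r) {
        TRITONBACKEND_Response* response;
        TRITONBACKEND_ResponseNew(&response, requests[r]);
        // ... access input tensors, run inference, append output tensors ...
        TRITONBACKEND_ResponseSend(
            response, TRITONSERVER_RESPONSE_COMPLETE_FINAL, nullptr /* success */);
        TRITONBACKEND_RequestRelease(requests[r], TRITONSERVER_REQUEST_RELEASE_ALL);
      }
      return nullptr;
    }

    }  // extern "C"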

Quick Start & Requirements

  • Build: mkdir build && cd build && cmake -DCMAKE_INSTALL_PREFIX:PATH=$(pwd)/install .. && make install
  • Dependencies: Requires Triton's common and core repositories. Specific tags can be set via CMake arguments (e.g., -DTRITON_COMMON_REPO_TAG=[tag]).
  • Integration: These utilities are typically pulled into a backend's own build via its CMakeLists.txt rather than by building this repository on its own; see the sketch after this list.
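
As an illustration of that integration, the official example backends fetch the common, core, and backend repositories with CMake's FetchContent and link against the utilities target this repository exports. A hedged sketch, assuming the triton-backend-utils target name from those examples and a hypothetical my-custom-backend library:

    include(FetchContent)

    # Fetch the Triton repositories at matching tags (see the CMake arguments above).
    FetchContent_Declare(repo-common
      GIT_REPOSITORY https://github.com/triton-inference-server/common.git
      GIT_TAG ${TRITON_COMMON_REPO_TAG})
    FetchContent_Declare(repo-core
      GIT_REPOSITORY https://github.com/triton-inference-server/core.git
      GIT_TAG ${TRITON_CORE_REPO_TAG})
    FetchContent_Declare(repo-backend
      GIT_REPOSITORY https://github.com/triton-inference-server/backend.git
      GIT_TAG ${TRITON_BACKEND_REPO_TAG})
    FetchContent_MakeAvailable(repo-common repo-core repo-backend)

    # my-custom-backend is a placeholder for your backend's shared library target.
    add_library(my-custom-backend SHARED minimal_backend.cc)
    target_link_libraries(my-custom-backend PRIVATE triton-backend-utils)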

Highlighted Details

  • Supports a wide range of existing backends including TensorRT, ONNX Runtime, TensorFlow, PyTorch, OpenVINO, DALI, FIL, TensorRT-LLM, and vLLM.
  • Provides a Python backend option for custom Python-based pre/post-processing or direct execution of Python scripts.
  • The Triton Backend API is C-based, offering fine-grained control over model execution and request handling.
  • Supports decoupled responses, allowing a backend to return multiple responses per request and to return them out of order (see the sketch after this list).
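
For the decoupled case, the API provides a response factory that outlives the request, so a backend can release the request early and keep emitting responses. A minimal sketch, assuming the same tritonbackend.h API; SendDecoupledResponses is a hypothetical per-request helper and error checks are elided:

    #include "triton/core/tritonbackend.h"

    // Hypothetical helper showing the decoupled flow for a single request that
    // produces num_steps responses (e.g., token-by-token generation).
    static void SendDecoupledResponses(TRITONBACKEND_Request* request, int num_steps)
    {
      TRITONBACKEND_ResponseFactory* factory;
      TRITONBACKEND_ResponseFactoryNew(&factory, request);

      // Once the inputs have been consumed, the request can be released even
      // though responses are still pending; the factory keeps the channel open.
      TRITONBACKEND_RequestRelease(request, TRITONSERVER_REQUEST_RELEASE_ALL);

      for (int step = 0; step < num_steps; ++step) {
        TRITONBACKEND_Response* response;
        TRITONBACKEND_ResponseNewFromFactory(&response, factory);
        // ... append this step's output tensors ...
        TRITONBACKEND_ResponseSend(response, 0 /* more responses follow */, nullptr);
      }

      // Signal completion without sending another response body.
      TRITONBACKEND_ResponseFactorySendFlags(
          factory, TRITONSERVER_RESPONSE_COMPLETE_FINAL);
      TRITONBACKEND_ResponseFactoryDelete(factory);
    }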

Maintenance & Community

  • This repository is part of the Triton Inference Server project. General questions can be directed to the main Triton issues page.

Licensing & Compatibility

  • The repository's license is not explicitly stated in the README. However, the Triton Inference Server repositories are released under the permissive BSD 3-Clause license, which is generally compatible with commercial and closed-source applications.

Limitations & Caveats

  • Backends developed with the "legacy custom backend" API are deprecated and must be ported to the new Triton Backend API.
  • Platform support varies across the different official backends; a "Backend-Platform Support Matrix" should be consulted.
Health Check

  • Last commit: 1 week ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 2
  • Issues (30d): 0
  • Star History: 21 stars in the last 90 days
