Serving by PaddlePaddle

Serving framework for online inference of PaddlePaddle models

Created 6 years ago
915 stars

Top 39.8% on SourcePulse

Project Summary

Paddle Serving is a high-performance, flexible, and easy-to-use online inference service framework built on PaddlePaddle. It targets deep learning developers and enterprises seeking industrial-grade deployment solutions for machine learning models, offering low latency and high throughput.

How It Works

Paddle Serving integrates Paddle Inference and Paddle Lite for efficient serving and edge deployment. It offers two primary frameworks: a high-performance C++ Serving backend leveraging the bRPC network framework for optimal throughput and latency, and a user-friendly Python Pipeline framework built on gRPC/gRPC-Gateway for rapid development. Both support asynchronous, DAG-based pipelines for complex model compositions, concurrent inference, dynamic batching, and multi-card/multi-stream processing.
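The DAG-based pipeline idea described above can be sketched in plain Python. This is a conceptual illustration only, not Paddle Serving's actual API: the `Op` class, `run_pipeline` function, and op names are hypothetical, and the real framework adds asynchronous execution, dynamic batching, and RPC transport on top of this basic dependency-ordered execution.

```python
# Conceptual sketch of a DAG-based inference pipeline (illustration only;
# these names are hypothetical, NOT Paddle Serving's actual API).
from graphlib import TopologicalSorter

class Op:
    """One pipeline node: a named processing step with named dependencies."""
    def __init__(self, name, fn, deps=()):
        self.name, self.fn, self.deps = name, fn, tuple(deps)

def run_pipeline(ops, request):
    """Execute ops in dependency order, feeding each op its inputs' outputs."""
    by_name = {op.name: op for op in ops}
    # TopologicalSorter maps each node to its set of predecessors.
    order = TopologicalSorter({op.name: set(op.deps) for op in ops}).static_order()
    results = {}
    for name in order:
        op = by_name[name]
        # Source ops (no deps) receive the raw request.
        inputs = [results[d] for d in op.deps] or [request]
        results[name] = op.fn(*inputs)
    return results

# Example DAG: preprocess -> (model_a, model_b) -> combine (a model ensemble).
ops = [
    Op("preprocess", lambda req: req * 2),
    Op("model_a", lambda x: x + 1, deps=["preprocess"]),
    Op("model_b", lambda x: x + 10, deps=["preprocess"]),
    Op("combine", lambda a, b: a + b, deps=["model_a", "model_b"]),
]
print(run_pipeline(ops, 3)["combine"])  # -> 23
```

In the real framework the fan-out ops (`model_a`, `model_b` here) can run concurrently on separate cards or streams, which is where the multi-card/multi-stream processing mentioned above applies.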

Quick Start & Requirements

  • Installation: Docker is strongly recommended. Native Linux installation and source compilation are also supported.
  • Prerequisites: Docker, Kubernetes (for cluster deployment), specific hardware drivers (Nvidia, Kunlun XPU, Huawei Ascend, etc.) for heterogeneous hardware support.
  • Resources: Detailed guides for various hardware and deployment scenarios are available.

Highlighted Details

  • Supports RESTful, gRPC, and bRPC protocols with C++, Python, and Java SDKs.
  • Optimized with Intel MKLDNN, Nvidia TensorRT, and low-precision quantization.
  • Provides model security features including encryption, authentication, and HTTPS gateways.
  • Offers distributed deployment for large-scale sparse parameter index models.
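As a rough sketch of how a RESTful client might interact with such a service: the snippet below composes a JSON inference request. The endpoint path and the `feed`/`fetch` field names are assumptions for illustration, not a statement of Paddle Serving's documented wire format.

```python
# Hedged sketch of composing a JSON payload for a RESTful inference endpoint.
# Field names ("feed", "fetch") and the endpoint URL are illustrative
# assumptions, not Paddle Serving's documented wire format.
import json

def build_request(feed, fetch):
    """Package input tensors (feed) and requested output names (fetch) as JSON."""
    return json.dumps({"feed": feed, "fetch": fetch})

payload = build_request(feed=[{"x": [0.1, 0.2, 0.3]}], fetch=["prediction"])

# A client would then POST this to the serving endpoint, e.g. with `requests`:
#   requests.post("http://127.0.0.1:9393/model/prediction",
#                 data=payload, headers={"Content-Type": "application/json"})
```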

Maintenance & Community

  • Active community with QQ groups for discussion.
  • Contribution guidelines are provided, with numerous contributors acknowledged for specific features and examples.
  • Feedback and bug reports are managed via GitHub Issues.

Licensing & Compatibility

  • License: Apache 2.0 License.
  • Compatibility: Permissive license suitable for commercial use and integration with closed-source applications.

Limitations & Caveats

The documentation is primarily in Simplified Chinese; English resources cover the contribution guidelines and some core concepts. Although a wide range of hardware is supported, setup for each platform may require careful attention to its specific guide.

Health Check

  • Last commit: 4 months ago
  • Responsiveness: 1 day
  • Pull requests (30d): 0
  • Issues (30d): 2
  • Star history: 2 stars in the last 30 days
