model_server by openvinotoolkit

Scalable inference server for OpenVINO-optimized models

Created 7 years ago

810 stars

Top 43.7% on SourcePulse

Project Summary

OpenVINO™ Model Server provides a scalable inference solution for models optimized with OpenVINO™. It enables remote inference, allowing lightweight clients to interact with models deployed on edge or cloud infrastructure via REST or gRPC, abstracting away framework and hardware dependencies. This makes it ideal for microservices and cloud-native applications, offering efficient resource utilization and simplified model management.

How It Works

The server hosts OpenVINO™-optimized models, exposing them through gRPC or REST APIs, mirroring TensorFlow Serving and KServe interfaces. It supports various frameworks (TensorFlow, PaddlePaddle, ONNX) and accelerators, with a Directed Acyclic Graph (DAG) scheduler for complex pipelines and custom nodes. Models can be managed dynamically, including versioning and runtime updates, with metrics compatible with Prometheus.

Quick Start & Requirements

Install/Run: Docker images are available on Docker Hub.
Prerequisites: Tested on RedHat, Ubuntu, and Windows.
Resources: Quick-start guides for vision and LLM use cases are available.
Links: QuickStart, LLM QuickStart

Highlighted Details

Native Windows support.
Text Embeddings and Reranking compatible with OpenAI and Cohere APIs.
Efficient Text Generation via OpenAI API.
gRPC streaming, MediaPipe graphs serving, and Python code execution.

Maintenance & Community

Binary packages for Linux and Windows are available on GitHub.
Submit questions, feature requests, or bug reports via GitHub issues.

Licensing & Compatibility

The README does not explicitly state the license. Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

The project's licensing is not clearly stated in the README, which may impact commercial adoption. Specific hardware requirements or performance benchmarks beyond general optimization claims are not detailed.

Health Check

Last Commit

1 day ago

Responsiveness

1 day

Pull Requests (30d)

Issues (30d)

Star History

8 stars in the last 30 days