kserve by kserve

Kubernetes CRD for scalable ML model serving

Created 6 years ago
4,561 stars

Top 10.8% on SourcePulse

View on GitHub
Project Summary

KServe provides a standardized, cloud-agnostic platform for deploying and serving machine learning models on Kubernetes. It targets ML engineers and data scientists needing robust, scalable inference solutions, offering advanced features like autoscaling, canary rollouts, and support for both predictive and generative AI models.

How It Works

KServe leverages Kubernetes Custom Resource Definitions (CRDs) to manage ML model deployments. It abstracts away the complexities of networking, autoscaling, and health checks, enabling serverless inference with features like scale-to-zero. For high-density serving, it optionally integrates with ModelMesh. The platform supports a standardized inference protocol, including OpenAI specifications for generative models, ensuring framework interoperability.
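The CRD-based workflow described above centers on the `InferenceService` resource. As a rough sketch, a minimal manifest using the v1beta1 API might look like the following (the model name and storage URI are illustrative placeholders, not part of this summary):

```yaml
# Minimal InferenceService sketch: KServe infers the serving runtime
# from modelFormat and handles networking, autoscaling, and health checks.
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: sklearn-iris          # illustrative name
spec:
  predictor:
    model:
      modelFormat:
        name: sklearn         # framework hint used to pick a runtime
      storageUri: "gs://kfserving-examples/models/sklearn/1.0/model"  # example path
```

Applying this with `kubectl apply -f` creates a served endpoint without any hand-written Deployment, Service, or Ingress objects.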

Quick Start & Requirements

  • Installation: KServe offers several installation methods, including standalone (serverless via Knative, raw deployment, or ModelMesh) and as a Kubeflow addon. A quick local installation option is also available.
  • Prerequisites: Kubernetes cluster, optionally Knative for serverless features.
  • Resources: Specific resource requirements depend on the chosen installation and workload.
  • Links: KServe website, Kubeflow KServe documentation, InferenceService API Reference

Highlighted Details

  • Supports a wide range of ML frameworks (TensorFlow, XGBoost, scikit-learn, PyTorch, Hugging Face) and generative AI models.
  • Offers advanced deployment strategies like canary rollouts, pipelines, and ensembles via InferenceGraph.
  • Provides GPU autoscaling and scale-to-zero capabilities for efficient resource utilization.
  • Includes components for pre/post-processing, monitoring, and explainability.
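The canary rollout and scale-to-zero features listed above are driven by fields on the predictor spec. A hedged sketch, again with illustrative names and paths:

```yaml
# Canary rollout + scale-to-zero sketch (v1beta1 API).
# canaryTrafficPercent splits traffic between the last rolled-out
# revision and this one; minReplicas: 0 allows idle pods to scale away
# under the serverless (Knative) installation mode.
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: sklearn-iris          # illustrative name
spec:
  predictor:
    canaryTrafficPercent: 10  # route 10% of traffic to this revision
    minReplicas: 0            # scale to zero when idle
    model:
      modelFormat:
        name: sklearn
      storageUri: "gs://example-bucket/models/sklearn/v2"  # example path
```

Promoting the canary is then a matter of raising `canaryTrafficPercent` (or removing it) once the new revision looks healthy.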

Maintenance & Community

KServe is an active project with community support. Further details on contributors, roadmap, and community channels can be found on their website.

Licensing & Compatibility

KServe is released under the Apache License 2.0, permitting commercial use and integration with closed-source applications.

Limitations & Caveats

The raw deployment installation mode does not support canary rollouts or request-based autoscaling with scale-to-zero; those features require the serverless (Knative) mode. Note also that the project was rebranded from KFServing to KServe, so older documentation and APIs may use the former name.

Health Check

  • Last Commit: 1 day ago
  • Responsiveness: 1 week
  • Pull Requests (30d): 42
  • Issues (30d): 20
  • Star History: 106 stars in the last 30 days

Explore Similar Projects

Starred by Eugene Yan (AI Scientist at AWS), Jared Palmer (Ex-VP AI at Vercel; Founder of Turborepo; Author of Formik, TSDX), and 4 more.

seldon-core by SeldonIO

0.2%
5k
MLOps framework for production model deployment on Kubernetes
Created 7 years ago
Updated 13 hours ago
Starred by Jeff Hammerbacher (Cofounder of Cloudera), Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), and 3 more.

serve by pytorch

0.1%
4k
Serve, optimize, and scale PyTorch models in production
Created 6 years ago
Updated 1 month ago