kserve by kserve

Kubernetes CRD for scalable ML model serving

created 6 years ago
4,401 stars

Top 11.4% on sourcepulse

View on GitHub
Project Summary

KServe provides a standardized, cloud-agnostic platform for deploying and serving machine learning models on Kubernetes. It targets ML engineers and data scientists needing robust, scalable inference solutions, offering advanced features like autoscaling, canary rollouts, and support for both predictive and generative AI models.

How It Works

KServe leverages Kubernetes Custom Resource Definitions (CRDs) to manage ML model deployments. It abstracts away the complexities of networking, autoscaling, and health checks, enabling serverless inference with features like scale-to-zero. For high-density serving, it optionally integrates with ModelMesh. The platform supports a standardized inference protocol, including OpenAI specifications for generative models, ensuring framework interoperability.
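Concretely, a deployment is declared as an InferenceService resource and applied with `kubectl`. The manifest below follows the shape of KServe's documented scikit-learn quickstart; the model name and `storageUri` are illustrative:

```yaml
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: sklearn-iris
spec:
  predictor:
    model:
      modelFormat:
        name: sklearn
      # Illustrative model location; point this at your own bucket.
      storageUri: gs://kfserving-examples/models/sklearn/1.0/model
```

Once applied, KServe provisions the serving infrastructure and exposes a prediction endpoint for the model.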

Quick Start & Requirements

  • Installation: KServe offers several installation methods, including standalone (serverless via Knative, raw deployment, or ModelMesh) and as a Kubeflow addon. A quick local installation option is also available.
  • Prerequisites: a Kubernetes cluster; Knative is required only for serverless features.
  • Resources: Specific resource requirements depend on the chosen installation and workload.
  • Links: KServe website, Kubeflow KServe documentation, InferenceService API Reference
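Once an InferenceService is up, it is queried over KServe's V1 inference protocol: a POST to `/v1/models/<name>:predict` with a JSON body of the form `{"instances": [...]}`. A minimal sketch of building such a request (the host and model name are placeholders, not real endpoints):

```python
import json


def build_predict_request(host: str, model: str, instances: list) -> tuple[str, bytes]:
    """Build the URL and JSON body for a KServe V1-protocol predict call.

    The host and model name are placeholders; substitute the ingress
    address and InferenceService name from your own cluster.
    """
    url = f"http://{host}/v1/models/{model}:predict"
    body = json.dumps({"instances": instances}).encode("utf-8")
    return url, body


url, body = build_predict_request(
    "sklearn-iris.default.example.com",  # placeholder ingress host
    "sklearn-iris",                      # placeholder InferenceService name
    [[6.8, 2.8, 4.8, 1.4]],              # one feature vector per instance
)
```

The resulting URL and body can be sent with any HTTP client; the response carries a matching `{"predictions": [...]}` payload.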

Highlighted Details

  • Supports a wide range of ML frameworks (TensorFlow, XGBoost, scikit-learn, PyTorch, Hugging Face) and generative AI models.
  • Offers advanced deployment strategies like canary rollouts, pipelines, and ensembles via InferenceGraph.
  • Provides GPU autoscaling and scale-to-zero capabilities for efficient resource utilization.
  • Includes components for pre/post-processing, monitoring, and explainability.
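As a sketch of how two of these features are expressed in the v1beta1 API (field names as I recall them from KServe's docs; values illustrative): `canaryTrafficPercent` on the predictor splits traffic between the previous and newest revision, and `minReplicas: 0` enables scale-to-zero in serverless mode.

```yaml
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: sklearn-iris
spec:
  predictor:
    canaryTrafficPercent: 10   # route 10% of traffic to the newest revision
    minReplicas: 0             # scale to zero when idle (serverless mode only)
    model:
      modelFormat:
        name: sklearn
      # Illustrative model location; point this at your own bucket.
      storageUri: gs://kfserving-examples/models/sklearn/1.0/model
```

Promoting the canary is then a matter of raising `canaryTrafficPercent` (or removing it) once the new revision proves healthy.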

Maintenance & Community

KServe is an active project with community support. Further details on contributors, roadmap, and community channels can be found on their website.

Licensing & Compatibility

KServe is released under the Apache License 2.0, permitting commercial use and integration with closed-source applications.

Limitations & Caveats

The "Raw Deployment" installation option does not support canary deployments or request-based autoscaling with scale-to-zero. Note that the project was rebranded from KFServing to KServe, so older documentation and resource names may still use the former name.

Health Check

  • Last commit: 2 days ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 54
  • Issues (30d): 23
  • Star History: 291 stars in the last 90 days

Explore Similar Projects

Starred by Eugene Yan (AI Scientist at AWS), Jared Palmer (Ex-VP of AI at Vercel; Founder of Turborepo; Author of Formik, TSDX), and 3 more.

seldon-core by SeldonIO

0.1%
5k
MLOps framework for production model deployment on Kubernetes
created 7 years ago
updated 1 day ago