kserve by kserve

Kubernetes CRD for scalable ML model serving

Created 6 years ago
4,561 stars

Top 10.8% on SourcePulse

View on GitHub
Project Summary

KServe provides a standardized, cloud-agnostic platform for deploying and serving machine learning models on Kubernetes. It targets ML engineers and data scientists needing robust, scalable inference solutions, offering advanced features like autoscaling, canary rollouts, and support for both predictive and generative AI models.

How It Works

KServe leverages Kubernetes Custom Resource Definitions (CRDs) to manage ML model deployments. It abstracts away the complexities of networking, autoscaling, and health checks, enabling serverless inference with features like scale-to-zero. For high-density serving, it optionally integrates with ModelMesh. The platform supports a standardized inference protocol, including OpenAI specifications for generative models, ensuring framework interoperability.
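The CRD-based workflow described above centers on the `InferenceService` resource. As a rough sketch, a minimal manifest using the v1beta1 API might look like the following (the model name and storage URI are illustrative placeholders, not part of this summary):

```yaml
# Minimal InferenceService sketch: KServe infers the serving runtime
# from modelFormat and handles networking, autoscaling, and health checks.
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: sklearn-iris          # illustrative name
spec:
  predictor:
    model:
      modelFormat:
        name: sklearn         # framework hint used to pick a runtime
      storageUri: "gs://kfserving-examples/models/sklearn/1.0/model"  # example path
```

Applying this with `kubectl apply -f` creates a served endpoint without any hand-written Deployment, Service, or Ingress objects.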

Quick Start & Requirements

  • Installation: KServe offers several installation methods, including standalone (serverless via Knative, raw deployment, or ModelMesh) and as a Kubeflow addon. A quick local installation option is also available.
  • Prerequisites: Kubernetes cluster, optionally Knative for serverless features.
  • Resources: Specific resource requirements depend on the chosen installation and workload.
  • Links: KServe website, Kubeflow KServe documentation, InferenceService API Reference

Highlighted Details

  • Supports a wide range of ML frameworks (TensorFlow, XGBoost, scikit-learn, PyTorch, Hugging Face) and generative AI models.
  • Offers advanced deployment strategies like canary rollouts, pipelines, and ensembles via InferenceGraph.
  • Provides GPU autoscaling and scale-to-zero capabilities for efficient resource utilization.
  • Includes components for pre/post-processing, monitoring, and explainability.
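The canary rollout and scale-to-zero features listed above are driven by fields on the predictor spec. A hedged sketch, again with illustrative names and paths:

```yaml
# Canary rollout + scale-to-zero sketch (v1beta1 API).
# canaryTrafficPercent splits traffic between the last rolled-out
# revision and this one; minReplicas: 0 allows idle pods to scale away
# under the serverless (Knative) installation mode.
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: sklearn-iris          # illustrative name
spec:
  predictor:
    canaryTrafficPercent: 10  # route 10% of traffic to this revision
    minReplicas: 0            # scale to zero when idle
    model:
      modelFormat:
        name: sklearn
      storageUri: "gs://example-bucket/models/sklearn/v2"  # example path
```

Promoting the canary is then a matter of raising `canaryTrafficPercent` (or removing it) once the new revision looks healthy.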

Maintenance & Community

KServe is an active project with community support. Further details on contributors, roadmap, and community channels can be found on their website.

Licensing & Compatibility

KServe is released under the Apache License 2.0, permitting commercial use and integration with closed-source applications.

Limitations & Caveats

The raw deployment installation mode does not support canary rollouts or request-based autoscaling with scale-to-zero; those features require the serverless (Knative) mode. Note also that the project was rebranded from KFServing to KServe, so older documentation and APIs may use the former name.

Health Check

  • Last Commit: 1 day ago
  • Responsiveness: 1 week
  • Pull Requests (30d): 42
  • Issues (30d): 20
  • Star History: 106 stars in the last 30 days

Explore Similar Projects

Starred by Eugene Yan (AI Scientist at AWS), Jared Palmer (Ex-VP AI at Vercel; Founder of Turborepo; Author of Formik, TSDX), and 4 more.

seldon-core by SeldonIO

0.2%
5k
MLOps framework for production model deployment on Kubernetes
Created 7 years ago
Updated 13 hours ago
Starred by Jeff Hammerbacher (Cofounder of Cloudera), Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), and 3 more.

serve by pytorch

0.1%
4k
Serve, optimize, and scale PyTorch models in production
Created 6 years ago
Updated 1 month ago