OME: Kubernetes operator for LLM serving
OME addresses the complex challenge of managing and serving Large Language Models (LLMs) in Kubernetes environments. It targets engineers and researchers who need robust, automated deployment and optimization of LLM inference at scale, offering improved resource utilization and reduced operational overhead.
How It Works
OME operates as a Kubernetes operator, using custom resources to define and manage LLMs as first-class citizens. It automates model parsing to extract critical metadata, selects an appropriate serving runtime (such as SGLang or Triton) via weighted scoring over model characteristics, and orchestrates the resulting deployment patterns. This approach improves GPU bin-packing and enables dynamic re-optimization for efficient resource utilization and high availability; a sketch of the declarative workflow follows.
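As a minimal illustration of that declarative model, a base model and an inference service might be registered like this. This is a sketch only: the ome.io/v1beta1 API group and all field names below are assumptions for illustration, not the project's verified schema; consult the OME documentation for the real CRDs.

    kubectl apply -f - <<'EOF'
    # Hypothetical sketch: register model weights, then serve them.
    # Kind/field names are assumptions; check the OME docs for the real schema.
    apiVersion: ome.io/v1beta1
    kind: ClusterBaseModel
    metadata:
      name: llama-3-8b-instruct
    spec:
      storage:
        storageUri: hf://meta-llama/Meta-Llama-3-8B-Instruct   # where to pull weights from
    ---
    apiVersion: ome.io/v1beta1
    kind: InferenceService
    metadata:
      name: llama-3-8b-instruct
      namespace: default
    spec:
      model:
        name: llama-3-8b-instruct   # resolved against the base model above;
                                    # the operator then scores runtimes and deploys one
    EOF

Under this model, the operator parses the referenced weights for metadata, scores the available serving runtimes against them, and creates the underlying Kubernetes workloads itself.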
Quick Start & Requirements
Install the CRDs and the operator via Helm's OCI charts:

    helm upgrade --install ome-crd oci://ghcr.io/moirai-internal/charts/ome-crd --namespace ome --create-namespace
    helm upgrade --install ome oci://ghcr.io/moirai-internal/charts/ome-resources --namespace ome

Alternatively, add the project's Helm repository and update it:

    helm repo add ome https://sgl-project.github.io/ome
    helm repo update

then install the CRD and resource charts from it.
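Once the charts are installed, standard kubectl commands (nothing OME-specific) can confirm that the operator and its CRDs are in place:

    kubectl get pods -n ome
    kubectl get crds | grep -i ome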
Highlighted Details

BenchmarkJob custom resource for performance evaluation (a hypothetical sketch follows).
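A benchmark run against the service above might be declared roughly as follows. Again a hypothetical sketch: BenchmarkJob is the documented kind, but every field name here is an assumption for illustration.

    kubectl apply -f - <<'EOF'
    # Hypothetical sketch; field names are assumptions, not the verified schema.
    apiVersion: ome.io/v1beta1
    kind: BenchmarkJob
    metadata:
      name: llama-3-8b-baseline
      namespace: default
    spec:
      endpoint:
        inferenceService:
          name: llama-3-8b-instruct   # target the service defined earlier
      task: text-to-text              # assumed task label
      outputLocation:
        storageUri: oci://my-bucket/benchmarks/   # hypothetical results bucket
    EOF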
Maintenance & Community
Licensing & Compatibility
Limitations & Caveats