kubedirector by bluek8s

Kubernetes operator for stateful generative AI and ML infrastructure

Created 7 years ago
410 stars

Top 71.2% on SourcePulse

Project Summary

KubeDirector is a Kubernetes operator that simplifies deploying generative AI and ML infrastructure packaged as complex, stateful applications. It targets AI researchers, MLOps engineers, and data scientists who need to move AI workloads from research to production, and it is positioned as a production-ready platform with an emphasis on efficient GPU utilization and multi-tenancy.

How It Works

KubeDirector runs as a custom controller within Kubernetes. It watches custom resources that define application clusters and manages those clusters accordingly. Its core design separates application deployment logic from the controller itself, which splits the work across three roles: application experts define deployable applications using metadata and configuration artifacts, site administrators manage which application types are available, and end-users deploy and reconfigure clusters using standard Kubernetes tools. As a result, application experts can enable new deployments without writing Go code or understanding controller internals, which makes updates and management easier.
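
The controller pattern described above, where a declared spec is compared against the observed state and the difference becomes a list of actions, can be sketched as follows. This is a minimal, hypothetical illustration in Python, not KubeDirector's actual controller code (which is written in Go); the names `RoleSpec` and `reconcile`, and the role IDs, are invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RoleSpec:
    """Desired state for one role of an application cluster (hypothetical shape)."""
    role_id: str
    members: int


def reconcile(desired: list[RoleSpec], observed: dict[str, int]) -> list[str]:
    """Return the actions needed to move observed member counts toward the spec."""
    actions = []
    desired_ids = {role.role_id for role in desired}
    for role in desired:
        current = observed.get(role.role_id, 0)
        if current < role.members:
            actions.append(f"scale up {role.role_id}: {current} -> {role.members}")
        elif current > role.members:
            actions.append(f"scale down {role.role_id}: {current} -> {role.members}")
    # Roles that are running but no longer declared in the spec get removed.
    for role_id in sorted(observed):
        if role_id not in desired_ids:
            actions.append(f"delete {role_id}")
    return actions


# Hypothetical example: the end-user asks for two inference members and one gateway.
desired = [RoleSpec("inference", 2), RoleSpec("gateway", 1)]
observed = {"inference": 1, "gateway": 1, "legacy": 3}
plan = reconcile(desired, observed)
print(plan)
```

The point of the sketch is that the function knows nothing about any particular application: everything application-specific arrives as data, which mirrors the separation between deployment logic and controller described above.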

Quick Start & Requirements

Installation details are available in quickstart.md. Cloud-provider notes for GKE and EKS are provided in gke-notes.md and eks-notes.md, respectively. The README does not list hardware requirements explicitly, but its focus on GPU-optimized clusters suggests that GPU nodes are expected for the AI workloads it targets.

Highlighted Details

  • Features seamless Ollama integration for deploying lightweight LLM inference servers (e.g., TinyLlama, Llama 2) via kubectl apply.
  • Supports a growing catalog of AI/ML applications, with upcoming additions including Stable Diffusion, vector databases (Weaviate, Pinecone, Chroma), fine-tuning frameworks, Jupyter environments, MLflow, and Ray clusters.
  • Designed for GPU-optimized clusters, enabling efficient utilization of expensive GPU resources across multiple workloads.
  • Provides multi-tenant AI workspaces for isolated team and project environments.
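
As a rough illustration of what deploying such a cluster via kubectl apply might involve, the sketch below builds a cluster manifest and prints it as JSON. Everything specific here is an assumption rather than something the summary confirms: the `apiVersion`, `kind`, and `spec` field names follow the KubeDirector cluster resource as I recall it, the app ID `ollama`, the role ID `inference`, the namespace, and the resource sizes are hypothetical, and `nvidia.com/gpu` is the resource name commonly exposed by the NVIDIA device plugin. Check the project's own examples before relying on any of it.

```python
import json


def build_cluster_manifest(name, namespace, app_id, members, gpus=0):
    """Build a cluster manifest as a plain dict (field names are assumptions)."""
    limits = {"cpu": "2", "memory": "8Gi"}  # hypothetical sizing
    if gpus > 0:
        # Assumed GPU resource name, as exposed by the NVIDIA device plugin.
        limits["nvidia.com/gpu"] = gpus
    return {
        "apiVersion": "kubedirector.hpe.com/v1beta1",  # assumed API group/version
        "kind": "KubeDirectorCluster",
        # A per-team namespace is one way to keep tenants isolated.
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "app": app_id,  # hypothetical application type ID
            "roles": [
                {"id": "inference", "members": members, "resources": {"limits": limits}}
            ],
        },
    }


manifest = build_cluster_manifest(
    name="tinyllama-demo", namespace="team-a", app_id="ollama", members=1, gpus=1
)
print(json.dumps(manifest, indent=2))
```

Because kubectl accepts JSON as well as YAML, the printed output could be piped into `kubectl apply -f -` once the field names and app ID have been checked against the installed application types.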

Maintenance & Community

Contributions are welcome. Questions and feedback go through the BlueK8s Slack workspace. The project has a Code of Conduct and a CONTRIBUTING guide for potential contributors, and it uses GitHub Actions CI for testing.

Licensing & Compatibility

The license type and compatibility notes for commercial use are not explicitly detailed in the provided README.

Limitations & Caveats

No specific limitations, alpha status, known bugs, or unsupported platforms are mentioned in the provided README.

Health Check

  • Last commit: 2 weeks ago
  • Responsiveness: Inactive
  • Pull requests (30d): 2
  • Issues (30d): 0
  • Star history: 6 stars in the last 30 days
