Helm chart for deploying Ollama on Kubernetes
This Helm chart deploys Ollama on Kubernetes, letting users run large language models inside their own cluster. It targets Kubernetes users, particularly those who need GPU acceleration for LLM inference, and simplifies the setup and management of Ollama instances.
How It Works
The chart deploys Ollama as a Kubernetes Deployment with configurable resource allocation, GPU integration (NVIDIA and AMD), and persistent storage via PersistentVolumeClaims. It can pre-pull models at startup and create models from custom templates, so a release comes up ready to serve rather than empty.
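The snippet below is a minimal sketch of a values override that turns on GPU scheduling, persistence, and model pre-loading. The key names (ollama.gpu, ollama.models.pull, persistentVolume) are assumptions based on the chart's documented layout; verify them against the values.yaml shipped with the chart version you install.

# values-gpu.yaml -- illustrative override; confirm keys against the chart's values.yaml
ollama:
  gpu:
    enabled: true      # request a GPU for the Ollama pod
    type: 'nvidia'     # 'nvidia' or 'amd'
    number: 1          # GPUs per pod
  models:
    pull:
      - llama3         # pulled at startup so the model is ready before the first request
persistentVolume:
  enabled: true        # keep downloaded models across pod restarts
  size: 30Gi

Pass the file at install time with helm install ollama otwld/ollama -f values-gpu.yaml (see Quick Start below).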
Quick Start & Requirements
helm repo add otwld https://otwld.github.io/ollama-helm/
helm repo update
helm install ollama otwld/ollama --namespace ollama --create-namespace
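After installation, a quick smoke test (a sketch; it assumes the release exposes a Service named ollama on Ollama's default port 11434 in the ollama namespace):

kubectl -n ollama get pods
kubectl -n ollama port-forward svc/ollama 11434:11434
curl http://localhost:11434/api/generate -d '{"model": "llama3", "prompt": "Why is the sky blue?"}'

/api/generate is Ollama's standard generation endpoint; the model must already be present (pre-pulled via the chart or pulled manually) before it can answer.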
Highlighted Details
Maintenance & Community
Licensing & Compatibility
Limitations & Caveats