ollama-helm by otwld

Helm chart for deploying Ollama on Kubernetes

Created 2 years ago

531 stars

Top 59.6% on SourcePulse

1 Expert Loves This Project

jmorganca

Cofounder of Ollama

Project Summary

This Helm chart provides a Kubernetes deployment for Ollama, enabling users to run large language models on their own cluster rather than through a hosted service. It targets Kubernetes users, particularly those who need GPU acceleration for LLM inference, and simplifies the setup and management of Ollama instances.

How It Works

The chart deploys Ollama as a Kubernetes Deployment, allowing for configurable resource allocation, GPU integration (NVIDIA and AMD), and persistent storage via PersistentVolumeClaims. It supports pre-loading models at startup and creating models from templates, offering flexibility in LLM deployment.
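
As a rough sketch of how those options might be expressed, the snippet below writes a values file that requests one NVIDIA GPU, pulls a model at startup, and turns on persistent storage. The key names (ollama.gpu, ollama.models.pull, persistentVolume), the model name llama3.2, the size 30Gi, and the file name ollama-values.yaml are assumptions for illustration and are not confirmed by this summary; `helm show values otwld/ollama` lists the keys the installed chart version actually accepts.

```shell
# Write a hypothetical values file. Key names are assumed; verify them with:
#   helm show values otwld/ollama
cat > ollama-values.yaml <<'EOF'
ollama:
  gpu:
    enabled: true      # assumed key: turn on GPU integration
    type: nvidia       # assumed key: 'nvidia' or 'amd'
    number: 1          # assumed key: number of GPUs to request
  models:
    pull:
      - llama3.2       # example model to download at startup
persistentVolume:
  enabled: true        # assumed key: keep models on a PersistentVolumeClaim
  size: 30Gi           # example size
EOF

# Apply it to the release (needs helm and cluster access):
# helm upgrade --install ollama otwld/ollama --namespace ollama --create-namespace --values ollama-values.yaml
```

Keeping the settings in a file rather than in a chain of --set flags makes the configuration easier to review and to reuse across upgrades.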

Quick Start & Requirements

Install:

helm repo add otwld https://otwld.github.io/ollama-helm/
helm repo update
helm install ollama otwld/ollama --namespace ollama --create-namespace

Requirements: Kubernetes >= 1.16.0-0 (CPU), >= 1.26.0-0 (GPU). GPU support requires specific NVIDIA or AMD drivers and compatible hardware.
Docs: Ollama Documentation, Ollama-Helm Chart
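
The version requirements above can be checked before installing. The following is a minimal sketch, assuming GNU `sort -V` is available; the function names version_ge, required_version, and check_cluster are hypothetical helpers, not part of the chart or of kubectl.

```shell
# version_ge A B: succeeds when version A is greater than or equal to version B.
# Relies on `sort -V` (version sort) putting the smaller version first.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# required_version MODE: minimum Kubernetes version from the requirements above.
required_version() {
  case "$1" in
    gpu) echo "1.26.0" ;;
    *)   echo "1.16.0" ;;
  esac
}

# check_cluster VERSION MODE: prints "ok" or "too old".
# VERSION is a plain x.y.z string; MODE is "cpu" or "gpu".
check_cluster() {
  if version_ge "$1" "$(required_version "$2")"; then
    echo "ok"
  else
    echo "too old"
  fi
}

check_cluster 1.28.3 gpu
check_cluster 1.20.0 gpu
```

The server version has to be passed in as a bare x.y.z string; a value reported by the cluster with a leading "v" or a vendor suffix would need to be trimmed first.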

Highlighted Details

GPU support for NVIDIA and AMD, including MIG for NVIDIA.
Ability to pull and run specified models on startup.
Support for creating models from templates or ConfigMaps.
Optional Ingress configuration for external access.
Persistent storage for Ollama data.
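
The optional Ingress could be enabled along the following lines. This is only a sketch: the ingress key layout is assumed to follow the common Helm chart convention, ollama.example.com is a placeholder hostname, and ollama-ingress-values.yaml is a hypothetical file name. Check `helm show values otwld/ollama` before relying on any of these keys.

```shell
# Write a hypothetical values overlay that exposes Ollama through an Ingress.
cat > ollama-ingress-values.yaml <<'EOF'
ingress:
  enabled: true                  # assumed key: create an Ingress resource
  hosts:
    - host: ollama.example.com   # placeholder hostname
      paths:
        - path: /
          pathType: Prefix
EOF

# Layer it on top of the existing release (needs helm and cluster access):
# helm upgrade ollama otwld/ollama --namespace ollama --reuse-values --values ollama-ingress-values.yaml

# Once DNS and the ingress controller are in place, list the available models:
# curl http://ollama.example.com/api/tags
```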

Maintenance & Community

Maintained by Jean Baptiste Detroyes and Nathan Tréhout.
Community support via OTWLD Discord and Ollama-Helm GitHub issues.

Licensing & Compatibility

The README does not explicitly state the chart's license, so check the repository's LICENSE file before reusing or redistributing it. Ollama is licensed separately; consult its license for specific usage terms.

Limitations & Caveats

GPU support depends on the specific hardware, drivers, and Kubernetes version. Not every GPU is guaranteed to work, AMD models in particular.
Upgrading from older chart versions (0.X.X to 1.X.X) requires migration of model configuration.

Health Check

Last Commit: 3 days ago
Responsiveness: 1 day
Pull Requests (30d): 3
Issues (30d): 0

Star History

7 stars in the last 30 days
