dockerLLM by TheBlokeAI

Docker templates for local LLMs

created 2 years ago
305 stars

Top 88.8% on sourcepulse

View on GitHub
Project Summary

This repository provides Dockerfiles for deploying large language models (LLMs) via the text-generation-webui interface, targeting users who want a streamlined, pre-configured environment for running various LLM backends. It simplifies the setup of complex AI inference environments, particularly on platforms like Runpod.

How It Works

The project leverages Docker to encapsulate text-generation-webui and its dependencies, including optimized backends like AutoGPTQ, ExLlama, and GGML. This approach ensures consistent environments and simplifies deployment across different hardware configurations, especially those with NVIDIA GPUs. Key advantages include automatic updates of ExLlama and text-generation-webui on boot, and support for multiple GPU acceleration methods.
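As a rough illustration of the pattern (a minimal sketch, not the repository's actual Dockerfile; the base-image tag and paths are assumptions), a template along these lines bakes the UI and its backends into a CUDA base image and defers updates to a boot-time entrypoint:

    # Hypothetical sketch of the layering approach described above.
    FROM nvidia/cuda:12.1.1-devel-ubuntu22.04

    RUN apt-get update && apt-get install -y git python3 python3-pip

    # Bake text-generation-webui and its dependencies into the image.
    RUN git clone https://github.com/oobabooga/text-generation-webui /app
    WORKDIR /app
    RUN pip3 install -r requirements.txt

    # A boot-time entrypoint can re-pull text-generation-webui and ExLlama,
    # so containers start with current code without an image rebuild.
    COPY entrypoint.sh /entrypoint.sh
    ENTRYPOINT ["/entrypoint.sh"]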

Quick Start & Requirements

  • Install/Run: Deploy via the Runpod template links provided in the README.
  • Prerequisites: An NVIDIA GPU with CUDA 12.1.1 (the container name may reflect an older version).
  • Setup: Runpod deployments are typically quick; model loading is configured via environment variables (see the example run after this list).
  • Docs: the Runpod templates "TheBloke's Local LLMs UI" and "TheBloke's Local LLMs UI & API".
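Outside Runpod, the same environment variables should drive the container. A hypothetical local invocation (the image tag is illustrative, not a published name; MODEL and UI_ARGS follow the behaviour described in this summary):

    # Hypothetical local run; the image name is an assumption.
    docker run --gpus all -p 7860:7860 \
      -e MODEL=TheBloke/Llama-2-7B-GPTQ \
      -e UI_ARGS="--listen --api" \
      thebloke/dockerllm:latest

Here MODEL names a Hugging Face repo to auto-download on boot, and UI_ARGS passes extra flags through to text-generation-webui's server.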

Highlighted Details

  • Supports Mixtral, Llama 2 (including 70B), and GPTQ models.
  • Integrates AutoGPTQ, ExLlama (2x faster for 4-bit Llama GPTQs), and CUDA-accelerated GGML.
  • Includes all text-generation-webui extensions (Chat, SuperBooga, Whisper).
  • Automatic model download/loading via the MODEL env var, and UI parameters via UI_ARGS (a boot-time entrypoint sketch follows this list).
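A minimal entrypoint sketch of that boot-time behaviour (assuming text-generation-webui's bundled download-model.py and server.py, plus the MODEL/UI_ARGS semantics above; this is not the project's actual script):

    #!/bin/bash
    # Hypothetical boot sequence: update, fetch the requested model, start the UI.
    set -e
    cd /app

    # Re-pull text-generation-webui on boot, as described above.
    git pull

    # If MODEL is set, fetch it with the UI's bundled helper.
    if [ -n "$MODEL" ]; then
        python3 download-model.py "$MODEL"
    fi

    # Launch the server, forwarding any extra flags from UI_ARGS.
    exec python3 server.py --listen $UI_ARGS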

Maintenance & Community

The project is maintained by "TheBloke," a prominent figure in the LLM community known for quantizing and distributing models. Updates were frequent during active development, though the health-check data below shows the last commit landed about a year ago.

Licensing & Compatibility

The repository itself consists of Dockerfiles, which are typically shared under permissive terms; however, the software they install (text-generation-webui and the various LLM backends) carries its own licenses. Suitability for commercial use therefore depends on the licenses of the included frameworks and models.

Limitations & Caveats

The container naming convention may not always reflect the CUDA version actually used internally, which can be confusing. Support is focused primarily on Runpod instances, though the Dockerfiles could be adapted for other environments.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 2 stars in the last 90 days
