llmserve by AlexsJones

Serve local LLMs with a TUI

Created 3 months ago

303 stars

Top 87.9% on SourcePulse

Project Summary

llmserve is a simple text user interface (TUI) that streamlines serving local large language models (LLMs). Managing many models scattered across different formats (GGUF, MLX) and inference engines is cumbersome; llmserve auto-detects available backends and model locations so that users can launch servers with minimal configuration. It is aimed at researchers and power users who frequently experiment with local LLMs and need a quick, unified way to serve them.

How It Works

llmserve employs a TUI built with Ratatui and Crossterm, presenting a three-panel layout: Sources (model locations), Models (searchable model table), and Serve/Logs (running servers and live output). It automatically discovers installed inference engines like llama-server, KoboldCpp, LocalAI, MLX, Ollama, vLLM, and LM Studio. The application scans specified directories for model files, enabling users to select a model and backend, adjust essential parameters via presets (e.g., context size, GPU layers), and launch servers directly from the interface. A key advantage is the real-time streaming of backend stdout/stderr logs, providing immediate feedback and diagnostics.
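The discovery step described above can be illustrated with a minimal sketch. It is not llmserve's actual implementation: the candidate binary names, the function names, and the choice to match only the `.gguf` extension are assumptions made for illustration, and the sketch targets Unix-like systems (no `.exe` handling, no protection against symlink cycles).

```rust
use std::env;
use std::fs;
use std::path::{Path, PathBuf};

/// Hypothetical list of backend executables to probe for. The names that
/// llmserve actually looks up are not stated in the summary.
const CANDIDATE_BINARIES: [&str; 3] = ["llama-server", "ollama", "vllm"];

/// Return the first location in a PATH-style string that contains `binary`.
fn find_on_path(binary: &str, path_var: &str) -> Option<PathBuf> {
    env::split_paths(path_var)
        .map(|dir| dir.join(binary))
        .find(|candidate| candidate.is_file())
}

/// Probe every candidate binary and keep the ones that were found.
fn detect_backends(path_var: &str) -> Vec<(String, PathBuf)> {
    CANDIDATE_BINARIES
        .iter()
        .filter_map(|name| find_on_path(name, path_var).map(|p| (name.to_string(), p)))
        .collect()
}

/// Walk `root` recursively and collect files whose extension is `gguf`.
/// Unreadable directories are skipped rather than treated as errors.
fn scan_models(root: &Path) -> Vec<PathBuf> {
    let mut found = Vec::new();
    let mut pending = vec![root.to_path_buf()];
    while let Some(dir) = pending.pop() {
        let Ok(entries) = fs::read_dir(&dir) else { continue };
        for entry in entries.flatten() {
            let path = entry.path();
            if path.is_dir() {
                pending.push(path);
            } else if path.extension().and_then(|e| e.to_str()) == Some("gguf") {
                found.push(path);
            }
        }
    }
    found.sort();
    found
}
```

A caller would typically pass the value of the `PATH` environment variable to `detect_backends` and each configured source directory to `scan_models`, then offer the two result lists for selection in the interface.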

Quick Start & Requirements

Primary install / run command:
- macOS/Linux (curl): curl -fsSL https://llmserve.axjns.dev/install.sh | sh
- Homebrew: brew tap AlexsJones/llmserve && brew install llmserve
- Cargo: cargo install llmserve
Non-default prerequisites: Compatible inference engines (e.g., llama-server, KoboldCpp, LocalAI, MLX, Ollama, vLLM, LM Studio) must be installed separately and accessible on the system. The MLX backend requires Python 3.x on macOS.
Links: Installation script URL: https://llmserve.axjns.dev/install.sh. Companion project llmfit is available for model suitability analysis.

Highlighted Details

- Auto-detection of 7 distinct LLM inference backends.
- Live, color-coded streaming of inference backend logs directly within the TUI.
- Support for per-backend configuration presets (context size, batch size, GPU layers, threads, extra CLI arguments).
- Capability to serve multiple models concurrently across different backends and ports.
- Automatic detection of vision model projector files (.mmproj) for relevant backends.
- Multiple TUI themes (7 available) for customization.
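
As a rough illustration of how a per-backend preset and per-model ports might translate into a launch command, the sketch below builds an argument list for llama-server. The `Preset` struct, its field names, and both functions are hypothetical and are not taken from llmserve. The flags (`--model`, `--port`, `--ctx-size`, `--batch-size`, `--n-gpu-layers`, `--threads`) are llama-server options as far as I know; check them against `llama-server --help` for the installed version.

```rust
/// Hypothetical preset shape mirroring the parameters listed above.
/// llmserve's real configuration format is not shown in the summary.
#[derive(Debug, Clone)]
struct Preset {
    context_size: u32,
    batch_size: u32,
    gpu_layers: u32,
    threads: u32,
    extra_args: Vec<String>,
}

/// Build a llama-server argument list from a model path, a port, and a preset.
/// Extra CLI arguments are appended last so they can refine the defaults.
fn llama_server_args(model: &str, port: u16, preset: &Preset) -> Vec<String> {
    let mut args = vec![
        "--model".to_string(),
        model.to_string(),
        "--port".to_string(),
        port.to_string(),
        "--ctx-size".to_string(),
        preset.context_size.to_string(),
        "--batch-size".to_string(),
        preset.batch_size.to_string(),
        "--n-gpu-layers".to_string(),
        preset.gpu_layers.to_string(),
        "--threads".to_string(),
        preset.threads.to_string(),
    ];
    args.extend(preset.extra_args.iter().cloned());
    args
}

/// Give each model its own port, counting up from `base_port`, so that
/// several servers can run side by side without colliding.
fn assign_ports(models: &[&str], base_port: u16) -> Vec<(String, u16)> {
    models
        .iter()
        .zip(base_port..)
        .map(|(model, port)| (model.to_string(), port))
        .collect()
}
```

The resulting vector could be handed to `std::process::Command::new("llama-server").args(...)`, with stdout and stderr piped back to the caller for display. Keeping argument construction separate from process spawning makes the mapping easy to test without a backend installed.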

Maintenance & Community

No specific details regarding maintainers, sponsorships, or community channels (like Discord or Slack) are provided in the README.

Licensing & Compatibility

License type: MIT License.
Compatibility notes: The permissive MIT license generally allows for commercial use and integration within closed-source projects without significant restrictions.

Limitations & Caveats

llmserve is a front end only: the LLM inference backends themselves must be installed and configured separately. Backends such as Ollama, vLLM, and LM Studio are detected, but they manage their own model registries and cannot serve local files directly through llmserve. Because it is an interactive TUI, it is less suitable for automated or scripted deployments than pure CLI tools.
