llmserve by AlexsJones

Serve local LLMs with a TUI

Created 2 months ago
254 stars

Top 99.0% on SourcePulse

Project Summary

llmserve is a simple text user interface (TUI) that streamlines serving local large language models (LLMs). Local models are often scattered across different formats (GGUF, MLX) and inference engines; llmserve auto-detects the available backends and model locations so that users can launch servers with minimal configuration. It suits researchers and power users who frequently experiment with local LLMs and want a quick, unified way to serve them.

How It Works

llmserve's TUI is built with Ratatui and Crossterm and uses a three-panel layout: Sources (model locations), Models (a searchable model table), and Serve/Logs (running servers and live output). It automatically discovers installed inference engines: llama-server, KoboldCpp, LocalAI, MLX, Ollama, vLLM, and LM Studio. The application scans the specified directories for model files, so users can select a model and backend, adjust essential parameters via presets (e.g., context size, GPU layers), and launch servers directly from the interface. Backend stdout/stderr logs are streamed in real time, giving immediate feedback and diagnostics.
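The log streaming described above can be illustrated with a short sketch. llmserve itself is built on Ratatui and Crossterm, which are Rust libraries; Python is used here only for brevity. The function name `stream_logs` is hypothetical, and the code shows one general way to read a child process's merged stdout/stderr line by line, not llmserve's actual implementation.

```python
import subprocess


def stream_logs(cmd):
    """Run cmd and yield its merged stdout/stderr one line at a time.

    Hypothetical helper: a generic pattern, not llmserve's own code.
    """
    proc = subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr into the stdout pipe
        text=True,                 # decode bytes to str
        bufsize=1,                 # line-buffered reads in text mode
    )
    try:
        for line in proc.stdout:
            yield line.rstrip("\n")
    finally:
        # If the consumer stops early, make sure the child does not linger.
        if proc.poll() is None:
            proc.terminate()
        proc.stdout.close()
        proc.wait()
```

A front end would consume this generator from a background thread and append each line to its log panel; the lines arrive as the backend writes them, subject to the backend's own output buffering.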

Quick Start & Requirements

  • Primary install / run command:
    • macOS/Linux (curl): curl -fsSL https://llmserve.axjns.dev/install.sh | sh
    • Homebrew: brew tap AlexsJones/llmserve && brew install llmserve
    • Cargo: cargo install llmserve
  • Non-default prerequisites: Compatible inference engines (e.g., llama-server, KoboldCpp, LocalAI, MLX, Ollama, vLLM, LM Studio) must be installed separately and accessible on the system. The MLX backend requires Python 3.x on macOS.
  • Links: Installation script URL: https://llmserve.axjns.dev/install.sh. Companion project llmfit is available for model suitability analysis.
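Before installing, it can help to check which inference engines are already reachable on the system. The sketch below does this with a plain `PATH` lookup. The mapping of backend names to executable names is an assumption for illustration: only `llama-server` is named in the summary above, and how llmserve itself performs detection is not described here.

```python
import shutil

# Assumed executable names; only "llama-server" appears in the summary above.
CANDIDATE_BACKENDS = {
    "llama-server": "llama-server",
    "Ollama": "ollama",
    "vLLM": "vllm",
}


def detect_backends(candidates=CANDIDATE_BACKENDS):
    """Return {backend name: resolved path} for executables found on PATH."""
    found = {}
    for name, executable in candidates.items():
        path = shutil.which(executable)  # None when nothing matches
        if path is not None:
            found[name] = path
    return found


if __name__ == "__main__":
    for name, path in detect_backends().items():
        print(f"{name}: {path}")
```

A lookup like this only covers engines that ship a command-line executable; GUI applications or Python packages would need a different check.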

Highlighted Details

  • Auto-detection of 7 distinct LLM inference backends.
  • Live, color-coded streaming of inference backend logs directly within the TUI.
  • Support for per-backend configuration presets (context size, batch size, GPU layers, threads, extra CLI arguments).
  • Capability to serve multiple models concurrently across different backends and ports.
  • Automatic detection of vision model projector files (.mmproj) for relevant backends.
  • Multiple TUI themes (7 available) for customization.
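To make the preset and projector features above concrete, here is a minimal sketch of how preset fields could be turned into a `llama-server` command line. The `Preset` class, its default values, and both function names are hypothetical. The flags are those of llama.cpp's `llama-server` as commonly documented (confirm with `llama-server --help`). The projector rule, a file in the model's directory with "mmproj" in its name, is an assumption; the summary does not state how llmserve matches projector files.

```python
from dataclasses import dataclass, field
from pathlib import Path


@dataclass
class Preset:
    """Hypothetical preset mirroring the fields listed above."""
    ctx_size: int = 4096      # illustrative defaults, not llmserve's
    batch_size: int = 512
    gpu_layers: int = 0
    threads: int = 4
    extra_args: list = field(default_factory=list)


def find_projector(model_path):
    """Return a sibling file with 'mmproj' in its name, or None (assumed rule)."""
    model_path = Path(model_path)
    for candidate in sorted(model_path.parent.iterdir()):
        if (candidate.is_file() and candidate != model_path
                and "mmproj" in candidate.name.lower()):
            return candidate
    return None


def llama_server_command(model_path, port, preset, mmproj_path=None):
    """Build an argv list for llama-server from a preset."""
    cmd = [
        "llama-server",
        "--model", str(model_path),
        "--port", str(port),
        "--ctx-size", str(preset.ctx_size),
        "--batch-size", str(preset.batch_size),
        "--n-gpu-layers", str(preset.gpu_layers),
        "--threads", str(preset.threads),
    ]
    if mmproj_path is not None:
        cmd += ["--mmproj", str(mmproj_path)]
    # Extra CLI arguments are appended unchanged at the end.
    return cmd + list(preset.extra_args)
```

Serving several models concurrently then amounts to building one such command per model with a distinct port and starting each as its own process.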

Maintenance & Community

No specific details regarding maintainers, sponsorships, or community channels (like Discord or Slack) are provided in the README.

Licensing & Compatibility

  • License type: MIT License.
  • Compatibility notes: The permissive MIT license allows commercial use and integration into closed-source projects, provided the copyright and license notice are retained.

Limitations & Caveats

llmserve is a front-end interface: it relies on the user having the actual LLM inference backends installed and configured separately. Ollama, vLLM, and LM Studio are detected, but they manage their own model registries and cannot serve local files directly through llmserve. Because it is an interactive TUI, llmserve is less suited to automated or scripted deployments than pure CLI tools.

Health Check

  • Last commit: 2 weeks ago
  • Responsiveness: Inactive
  • Pull requests (30d): 3
  • Issues (30d): 0
  • Star history: 34 stars in the last 30 days
