llmfit by AlexsJones

LLM model selection and optimization tool for local hardware

Created 1 month ago
22,473 stars

Top 2.1% on SourcePulse

Project Summary

llmfit is a terminal tool that tackles the challenge of running LLMs locally: a single command reports which of 94 models from 30 providers will run optimally on your specific hardware. It automatically detects system RAM, CPU, and GPU capabilities, then scores models across quality, speed, fit, and context dimensions. An interactive TUI and a classic CLI simplify finding suitable LLMs for your machine, and advanced features cover multi-GPU setups and MoE architectures.

How It Works

The tool probes system hardware (RAM, CPU, GPUs) and detects acceleration backends (CUDA, Metal, ROCm, SYCL, or plain CPU). It ships with a compiled-in database of 94 HuggingFace models. llmfit estimates memory requirements, with explicit support for Mixture-of-Experts (MoE) models: VRAM is calculated from the active experts rather than the total parameter count. It applies dynamic quantization, selecting the highest-quality quantization level (from Q8_0 down to Q2_K) that fits available memory. Models are scored across Quality, Speed, Fit, and Context dimensions, with weights adjusted per use case, yielding a composite ranking score.
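The memory estimation, quantization fallback, and weighted scoring described above can be sketched roughly as follows. This is an illustrative sketch only: the bytes-per-parameter table, the overhead constant, and the scoring weights are assumptions for exposition, not llmfit's actual database, formulas, or defaults.

```python
# Illustrative sketch of MoE-aware sizing, quantization selection, and
# composite scoring. All constants here are assumed, not llmfit's real values.

# Approximate bytes per parameter for common GGUF quantization levels,
# ordered from highest quality (Q8_0) down to lowest (Q2_K).
BYTES_PER_PARAM = {
    "Q8_0": 1.06,
    "Q6_K": 0.82,
    "Q5_K_M": 0.69,
    "Q4_K_M": 0.56,
    "Q3_K_M": 0.44,
    "Q2_K": 0.35,
}

def estimate_vram_gb(active_params_b, quant, overhead_gb=1.5):
    """MoE-aware memory estimate in GB.

    active_params_b is the count of *active* parameters in billions; for a
    dense model this equals the total parameter count. The fixed overhead
    term (KV cache, runtime buffers) is an assumed constant.
    """
    # billions of params x bytes/param == GB of weights
    return active_params_b * BYTES_PER_PARAM[quant] + overhead_gb

def best_quant(active_params_b, vram_gb):
    """Return the highest-quality quantization that fits, else None."""
    for quant in BYTES_PER_PARAM:  # dicts preserve insertion order
        if estimate_vram_gb(active_params_b, quant) <= vram_gb:
            return quant
    return None

def composite_score(scores, weights):
    """Weighted sum over Quality/Speed/Fit/Context dimensions."""
    return sum(scores[dim] * weights[dim] for dim in weights)

# Example: a model with 7B active parameters on an 8 GB GPU.
print(best_quant(7, 8))
```

The key MoE point is that an MoE model is sized by its active-parameter count: a model with a large total parameter count but only a fraction of experts active per token is scored against the smaller active figure, which is why such models can fit on hardware their headline size would seem to rule out.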

Quick Start & Requirements

Installation is straightforward across macOS, Linux, and Windows:

  • Quick install (macOS/Linux): curl -fsSL https://llmfit.axjns.dev/install.sh | sh
  • Homebrew: brew tap AlexsJones/llmfit && brew install llmfit
  • Cargo: cargo install llmfit (requires Rust via rustup)
  • From source: clone the repo and run cargo build --release

Highlighted Details

  • Interactive TUI and classic CLI modes for model discovery.
  • Automated hardware and acceleration backend detection.
  • Accurate VRAM estimation for MoE architectures.
  • Dynamic quantization for optimal model fit and quality.
  • Multi-dimensional scoring tailored to LLM use-cases.
  • JSON output for agent and scripting integration.
  • OpenClaw skill for hardware-aware local model recommendations and configuration.

Maintenance & Community

Contributions are welcome, especially for expanding the model database. An automated script (make update-models) refreshes the HuggingFace model list. The llmfit-advisor skill integrates with the OpenClaw agent ecosystem.

Licensing & Compatibility

llmfit is released under the permissive MIT license, which allows commercial use and integration into closed-source projects.

Limitations & Caveats

Speed metrics are estimates, not real-world benchmarks. VRAM may be reported as unknown for some AMD GPUs. The model database is embedded in the binary and must be updated manually via the provided scripts.

Health Check

  • Last Commit: 2 days ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 170
  • Issues (30d): 34
  • Star History: 7,341 stars in the last 30 days

Explore Similar Projects

Starred by Ying Sheng (Coauthor of SGLang) and Stas Bekman (Author of "Machine Learning Engineering Open Book"; Research Engineer at Snowflake).

llm-analysis by cli99
0% · 485 stars
CLI tool for LLM latency/memory analysis during training/inference
Created 2 years ago · Updated 11 months ago
Starred by Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems") and Yineng Zhang (Inference Lead at SGLang; Research Scientist at Together AI).

rtp-llm by alibaba
0.3% · 1k stars
LLM inference engine for diverse applications
Created 2 years ago · Updated 22 hours ago
Starred by Tobi Lutke (Cofounder of Shopify), Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), and 6 more.

xTuring by stochasticai
0.0% · 3k stars
SDK for fine-tuning and customizing open-source LLMs
Created 3 years ago · Updated 1 month ago
Starred by Wing Lian (Founder of Axolotl AI) and Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems").

airllm by lyogavin
2.5% · 15k stars
Inference optimization for LLMs on low-resource hardware
Created 2 years ago · Updated 1 month ago