llmfit by AlexsJones

LLM model selection and optimization tool for local hardware

Created 1 week ago

4,265 stars

Top 11.4% on SourcePulse

Project Summary

Summary

llmfit is a terminal tool that tackles the challenge of running LLMs locally: a single command reports which of the 94 models (from 30 providers) in its built-in database will run best on your specific hardware. It automatically detects system RAM, CPU, and GPU capabilities, then scores models across quality, speed, fit, and context dimensions. The tool offers both an interactive TUI and a classic CLI, simplifying the search for LLMs suited to your machine, and supports advanced setups such as multi-GPU configurations and Mixture-of-Experts (MoE) architectures.

How It Works

The tool probes system hardware (RAM, CPU, GPUs) and detects acceleration backends (CUDA, Metal, ROCm, SYCL, or CPU fallback). It ships with a compiled-in database of 94 HuggingFace models. llmfit estimates memory requirements and, crucially, supports Mixture-of-Experts (MoE) models by calculating VRAM from the active experts rather than the total parameter count. It applies dynamic quantization, selecting the highest-quality quantization level (from Q8_0 down to Q2_K) that fits in available memory. Models are then scored across Quality, Speed, Fit, and Context dimensions, with weights adjusted per use case, to produce a composite ranking score.
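
The fitting and scoring steps described above can be sketched as follows. This is an illustrative approximation, not llmfit's actual code: the bytes-per-parameter figures, the 90% expert-parameter fraction, and the dimension weights are all assumptions made for the example.

```python
# Illustrative sketch of the pipeline described above; the constants and the
# 90% expert-parameter fraction are assumptions, not llmfit's real values.

BYTES_PER_PARAM = {  # approximate bytes per parameter for common GGUF quants
    "Q8_0": 1.07, "Q6_K": 0.82, "Q5_K_M": 0.69,
    "Q4_K_M": 0.57, "Q3_K_M": 0.44, "Q2_K": 0.35,
}

def moe_active_params(total_params: float, n_experts: int, n_active: int) -> float:
    """Rough MoE estimate: only the active experts must be resident per token.

    Assumes ~90% of parameters live in expert layers (an illustrative guess);
    the remainder (attention, embeddings) is always loaded.
    """
    expert = 0.9 * total_params
    shared = total_params - expert
    return shared + expert * (n_active / n_experts)

def pick_quant(params: float, budget_bytes: float):
    """Return the highest-quality quantization whose weights fit the budget."""
    for quant, bpp in BYTES_PER_PARAM.items():  # insertion order: best first
        if params * bpp <= budget_bytes:
            return quant
    return None  # model does not fit even at Q2_K

def composite(scores: dict, weights: dict) -> float:
    """Weighted average over the Quality/Speed/Fit/Context dimensions."""
    return sum(scores[d] * weights[d] for d in weights) / sum(weights.values())
```

For example, a Mixtral-style model (47B total parameters, 2 of 8 experts active) comes out to roughly 15B resident parameters under this estimate, which `pick_quant` would fit into a 12 GiB budget at Q6_K rather than Q8_0.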

Quick Start & Requirements

Installation is straightforward on macOS, Linux, and Windows. On macOS/Linux, use the quick install script (curl -fsSL https://llmfit.axjns.dev/install.sh | sh) or Homebrew (brew tap AlexsJones/llmfit && brew install llmfit). Alternatively, install via Cargo (cargo install llmfit), which requires a Rust toolchain (rustup). To build from source, clone the repo and run cargo build --release.

Highlighted Details

  • Interactive TUI and classic CLI modes for model discovery.
  • Automated hardware and acceleration backend detection.
  • Accurate VRAM estimation for MoE architectures.
  • Dynamic quantization for optimal model fit and quality.
  • Multi-dimensional scoring tailored to LLM use-cases.
  • JSON output for agent and scripting integration.
  • OpenClaw skill for hardware-aware local model recommendations and configuration.
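
The JSON output mentioned above lets scripts and agents consume recommendations directly. A minimal sketch of such an integration, assuming a hypothetical output schema (a list of objects with "name", "score", and "quant" fields; the real format may differ, so check llmfit's documentation):

```python
import json

def best_model(json_text: str) -> dict:
    """Pick the top-ranked entry from llmfit-style JSON output.

    The schema (a list of {"name", "score", "quant"} objects) is an
    illustrative guess, not llmfit's documented format.
    """
    return max(json.loads(json_text), key=lambda m: m["score"])

# Sample output in the assumed schema; in practice you would capture
# llmfit's JSON mode via subprocess and feed its stdout here.
sample = (
    '[{"name": "llama-3.1-8b", "score": 0.91, "quant": "Q5_K_M"},'
    ' {"name": "qwen2.5-14b", "score": 0.84, "quant": "Q4_K_M"}]'
)
print(best_model(sample)["name"])  # -> llama-3.1-8b
```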

Maintenance & Community

Contributions are welcome, especially for expanding the model database. An automated script (make update-models) refreshes the HuggingFace model list. The llmfit-advisor skill integrates with the OpenClaw agent ecosystem.

Licensing & Compatibility

llmfit is released under the permissive MIT license, allowing commercial use and integration within closed-source projects.

Limitations & Caveats

Speed metrics are estimated rather than drawn from real-world benchmarks. VRAM for AMD GPUs may be reported as unknown. The model database is embedded in the binary and must be updated manually via the provided scripts.

Health Check

  • Last commit: 1 day ago
  • Responsiveness: Inactive
  • Pull requests (30d): 42
  • Issues (30d): 38
  • Star history: 4,365 stars in the last 10 days

Explore Similar Projects

Starred by Ying Sheng (Coauthor of SGLang) and Stas Bekman (Author of "Machine Learning Engineering Open Book"; Research Engineer at Snowflake).

llm-analysis by cli99

0% · 477
CLI tool for LLM latency/memory analysis during training/inference
Created 2 years ago · Updated 10 months ago
Starred by Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems") and Yineng Zhang (Inference Lead at SGLang; Research Scientist at Together AI).

rtp-llm by alibaba

0.3% · 1k
LLM inference engine for diverse applications
Created 2 years ago · Updated 15 hours ago
Starred by Tobi Lutke (Cofounder of Shopify), Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), and 6 more.

xTuring by stochasticai

0.1% · 3k
SDK for fine-tuning and customizing open-source LLMs
Created 2 years ago · Updated 5 days ago
Starred by Wing Lian (Founder of Axolotl AI) and Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems").

airllm by lyogavin

9.5% · 13k
Inference optimization for LLMs on low-resource hardware
Created 2 years ago · Updated 5 months ago