LLM-Calc by RayFernando1337

LLM RAM calculator for inference optimization

Created 1 year ago
253 stars

Top 99.3% on SourcePulse

Project Summary

Instantly calculate the maximum size of quantized language models that can fit in your available RAM, helping you optimize models for inference. This interactive React + TypeScript + Vite application targets engineers and researchers needing to quickly estimate LLM deployment feasibility based on hardware constraints.

How It Works

The project uses a modern frontend stack: React for the interactive user interface, TypeScript for type safety, and Vite for rapid development with Hot Module Replacement (HMR). The core calculation converts available RAM and estimated OS overhead from gigabytes to bytes, subtracts the memory required for the specified context window size, and converts the chosen quantization level (bits per parameter) into bytes per parameter. Dividing the remaining RAM by the bytes per parameter yields an estimate of the maximum number of model parameters that will fit.
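The calculation described above can be sketched in TypeScript. The function name, parameter names, and the per-token context cost are illustrative assumptions, not the app's actual source:

```typescript
// Hypothetical sketch of the calculator's core formula.
const BYTES_PER_GB = 1024 ** 3;

function maxParams(
  availableRamGb: number, // total RAM on the machine
  osOverheadGb: number, // estimated RAM reserved for the OS
  contextTokens: number, // desired context window size
  bytesPerContextToken: number, // assumed memory cost per context token
  quantBits: number // quantization level, e.g. 4, 8, or 16 bits
): number {
  // Convert GB to bytes, then subtract context window memory.
  const usableBytes =
    (availableRamGb - osOverheadGb) * BYTES_PER_GB -
    contextTokens * bytesPerContextToken;
  // Convert bits per parameter to bytes per parameter (4-bit -> 0.5 bytes).
  const bytesPerParam = quantBits / 8;
  return Math.max(0, Math.floor(usableBytes / bytesPerParam));
}

// 16 GB RAM, 2 GB OS overhead, 8192-token context at an assumed
// 0.5 MB/token, 4-bit quantization -> ~21B parameters.
console.log(maxParams(16, 2, 8192, 0.5 * 1024 ** 2, 4));
```

The real memory cost per context token depends on the model's architecture (KV cache size per layer), which is why the result is an estimate rather than a guarantee.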

Quick Start & Requirements

  • Install: bun install
  • Run: bun run dev
  • Prerequisites: Bun JavaScript runtime and package manager.
  • Access: Navigate to http://localhost:5173 in your browser.
  • Production Build: bun run build (output in dist directory).

Highlighted Details

  • Interactive UI for estimating LLM RAM requirements.
  • Considers available RAM, OS overhead, context window size, and quantization level.
  • Fast development environment powered by Vite and HMR.
  • Styling implemented with Tailwind CSS.

Maintenance & Community

Contributions are welcome via Pull Requests. The README does not specify maintainers, sponsorships, or community channels like Discord or Slack.

Licensing & Compatibility

Licensed under the MIT License, which permits commercial use and integration into closed-source projects.

Limitations & Caveats

The tool provides an estimate based on user-provided inputs for RAM, OS usage, and context window. Actual model performance and memory footprint may vary due to specific LLM architectures, runtime efficiencies, and other system-level factors not included in this calculation. This is a frontend calculator, not an inference engine.

Health Check

  • Last Commit: 2 weeks ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 5 stars in the last 30 days

Explore Similar Projects

Starred by Wing Lian (founder of Axolotl AI) and Chip Huyen (author of "AI Engineering" and "Designing Machine Learning Systems").

airllm by lyogavin

  • Top 1.9% · 14k stars
  • Inference optimization for LLMs on low-resource hardware
  • Created 2 years ago, updated 3 days ago