binary-mlc-llm-libs  by mlc-ai

Pre-compiled LLM libraries for efficient deployment

Created 2 years ago
256 stars

Top 98.7% on SourcePulse

Project Summary

This repository provides pre-compiled, quantized libraries for various Large Language Models (LLMs) designed for efficient deployment across diverse hardware platforms. It targets developers and researchers seeking to run LLMs locally with optimized performance and reduced resource requirements.

How It Works

The project names each model library according to a structured convention: {model_name}/{model_name}-{quantization}-{metadata}-{platform}.{suffix}. The metadata segment encodes context window size, sliding window size, and prefill chunk size; values left at their defaults are omitted for brevity. This convention makes it easy to find and select a pre-compiled variant for a specific hardware target and performance requirement.
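The naming convention can be illustrated with a small Python sketch that assembles a library path from its parts. The function name `library_path` and the sample values below are hypothetical and chosen for illustration only. The sketch also assumes that the metadata segment is dropped entirely when every metadata value is at its default, which the summary implies but does not state outright.

```python
from typing import Optional


def library_path(
    model_name: str,
    quantization: str,
    platform: str,
    suffix: str,
    metadata: Optional[str] = None,
) -> str:
    """Build {model_name}/{model_name}-{quantization}-{metadata}-{platform}.{suffix}.

    `metadata` is treated as an opaque, pre-formatted string. Passing None
    (or an empty string) omits the segment; this is an assumption about how
    "defaults omitted" is applied, not confirmed behaviour.
    """
    parts = [model_name, quantization]
    if metadata:
        parts.append(metadata)
    parts.append(platform)
    return f"{model_name}/{'-'.join(parts)}.{suffix}"


# Illustrative values only; check the repository for the names actually used.
with_meta = library_path("Llama-2-7b-chat-hf", "q4f16_1", "webgpu", "wasm",
                         metadata="ctx4k_cs1k")
without_meta = library_path("Llama-2-7b-chat-hf", "q4f16_1", "webgpu", "wasm")
print(with_meta)
print(without_meta)
```

Keeping the metadata as an opaque string avoids guessing at how each individual setting is abbreviated in real file names.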

Highlighted Details

  • Supports popular models like Llama-3, Llama-2, Mistral, RedPajama, Phi, and GPT-2.
  • Includes quantization options for reduced memory footprint and faster inference.
  • Provides metadata for context window, sliding window, and prefill chunk sizes.
  • Organized by model, quantization, metadata, and target platform for streamlined selection.

Maintenance & Community

This repository appears to be a component of the broader MLC LLM project. Further community and maintenance details would likely be found on the main MLC LLM repository.

Licensing & Compatibility

The specific license for these binary libraries is not detailed in the provided README. Compatibility for commercial use or closed-source linking would require clarification of the licensing terms.

Limitations & Caveats

The README does not specify the exact quantization methods used or the supported platforms beyond a general mention. The absence of explicit licensing information is a significant caveat for adoption.

Health Check

  • Last commit: 3 weeks ago
  • Responsiveness: 1 week
  • Pull requests (30d): 2
  • Issues (30d): 0
  • Star history: 4 stars in the last 30 days
