mlx-llm by riccardomusmeci

LLM tools/apps for Apple Silicon using MLX

Created 1 year ago
454 stars

Top 66.5% on SourcePulse

Project Summary

This repository provides a Python library for running Large Language Models (LLMs) on Apple Silicon with the MLX framework, for real-time inference and LLM-based applications. It targets developers and researchers on Apple hardware who need efficient local LLM deployment.

How It Works

The library builds on Apple's MLX framework, which is designed for efficient tensor computation on Apple Silicon. It offers a streamlined API for loading pre-trained models from Hugging Face, quantizing them to reduce memory footprint and speed up inference, and extracting embeddings. Models integrate directly with MLX's array operations, which allows custom model manipulation and fine-tuning.
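To make the quantization step concrete, here is a minimal pure-Python sketch of group-wise affine quantization, the general idea behind low-bit weight formats. It is an illustration under stated assumptions, not the mlx-llm or MLX implementation: the function names, the toy weights, and the group size of 4 are all hypothetical and chosen for readability.

```python
def quantize_group(weights, bits=4):
    """Map one group of floats to integer codes in [0, 2**bits - 1]."""
    levels = (1 << bits) - 1
    lo, hi = min(weights), max(weights)
    # Guard against a constant group, where hi == lo.
    scale = (hi - lo) / levels if hi > lo else 1.0
    codes = [round((w - lo) / scale) for w in weights]
    return codes, scale, lo


def quantize(weights, group_size=4, bits=4):
    """Quantize a flat weight list group by group (hypothetical helper)."""
    return [
        quantize_group(weights[start:start + group_size], bits)
        for start in range(0, len(weights), group_size)
    ]


def dequantize(groups):
    """Rebuild approximate floats from (codes, scale, bias) triples."""
    restored = []
    for codes, scale, bias in groups:
        restored.extend(code * scale + bias for code in codes)
    return restored


# Toy weights, not taken from any real model.
weights = [0.12, -0.40, 0.33, 0.05, 1.50, 1.10, 0.90, 1.30]
groups = quantize(weights, group_size=4, bits=4)
restored = dequantize(groups)
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(f"max round-trip error: {max_error:.4f}")
```

Each group stores small integer codes plus one scale and one bias, so the per-weight error is bounded by half the group's scale. Smaller groups track the weights more closely at the cost of more stored metadata.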

Quick Start & Requirements

Highlighted Details

  • Supports a wide range of LLM families including LLaMA, Mistral, Phi3, Gemma, and OpenELM.
  • Supports 4-bit quantization to reduce memory use and speed up inference.
  • Provides utilities for extracting model embeddings.
  • Includes a chat interface for interactive LLM conversations.
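As a rough guide to what 4-bit quantization saves, the arithmetic below estimates weight storage for a model. It is a back-of-the-envelope sketch: the 7-billion-parameter model size, the group size of 64, and the 16-bit scale and bias per group are assumptions for illustration, not figures from the mlx-llm README, and the function name is hypothetical.

```python
def weight_bytes(num_params, bits, group_size=None, meta_bits=32):
    """Estimate weight storage in bytes.

    meta_bits is the per-group overhead; 32 assumes one 16-bit scale
    and one 16-bit bias per group (an assumption, not a library fact).
    """
    total_bits = num_params * bits
    if group_size is not None:
        num_groups = -(-num_params // group_size)  # ceiling division
        total_bits += num_groups * meta_bits
    return total_bits / 8


params = 7_000_000_000  # hypothetical 7B-parameter model
fp16 = weight_bytes(params, bits=16)
q4 = weight_bytes(params, bits=4, group_size=64)

print(f"16-bit weights: {fp16 / 1e9:.2f} GB")
print(f"4-bit weights:  {q4 / 1e9:.2f} GB")
print(f"reduction:      {fp16 / q4:.2f}x")
```

Under these assumptions the weights shrink from about 14 GB to about 3.9 GB, a reduction of roughly 3.6x rather than a full 4x, because each group also stores its scale and bias. Activations and the key-value cache add to the real memory footprint and are not counted here.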

Maintenance & Community

  • Maintained by riccardomusmeci.
  • Contact email provided for questions.

Licensing & Compatibility

  • License not explicitly stated in the README.
  • Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

OpenELM chat mode is noted as broken, with a fix under active development. The README does not state a license, which may limit commercial adoption.

Health Check

  • Last commit: 7 months ago
  • Responsiveness: Inactive
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 1 star in the last 30 days
