mlx-engine by lmstudio-ai

Apple MLX engine for LM Studio

Created 1 year ago
808 stars

Top 43.8% on SourcePulse

Project Summary

This repository provides the Apple MLX engine for LM Studio, enabling efficient large language model (LLM) inference on Apple Silicon hardware. It targets Mac users of LM Studio, offering a pre-bundled, optimized solution for running various LLMs, including text and vision models, with features like speculative decoding.

How It Works

The engine leverages Apple's MLX framework, a high-performance array library designed for machine learning on Apple hardware. It integrates with mlx-lm for text generation and mlx-vlm for vision-language tasks, enabling direct inference without complex setup. The architecture also supports speculative decoding, which accelerates inference by having a smaller, faster draft model propose tokens that the main model then verifies.
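The speculative decoding idea above can be sketched in plain Python. This is a toy illustration with stand-in "models" (simple functions over token lists); the function names and the greedy accept rule are assumptions for exposition, not mlx-engine's actual implementation:

```python
# Toy sketch of speculative decoding (illustrative; not mlx-engine's code).
# A cheap "draft" model proposes k tokens ahead; the "target" model verifies
# them, keeping the longest agreeing prefix plus one corrected token.

def speculative_decode(target, draft, prompt, k=4, max_tokens=16):
    """Greedy speculative decoding over integer token sequences."""
    seq = list(prompt)
    while len(seq) - len(prompt) < max_tokens:
        # 1. Draft model proposes k tokens autoregressively (cheap).
        proposal = []
        for _ in range(k):
            proposal.append(draft(seq + proposal))
        # 2. Target model checks each proposed position; accept tokens while
        #    both models agree, and substitute the target's token on the
        #    first mismatch.
        accepted = []
        for i, tok in enumerate(proposal):
            t = target(seq + proposal[:i])
            if t == tok:
                accepted.append(tok)
            else:
                accepted.append(t)  # target's token replaces the mismatch
                break
        seq.extend(accepted)
    return seq[:len(prompt) + max_tokens]

# Toy "models": target predicts next int mod 7; draft mostly agrees but
# disagrees whenever the last token is 3, forcing a verification miss.
target = lambda s: (s[-1] + 1) % 7
draft = lambda s: (s[-1] + 1) % 7 if s[-1] != 3 else 0

out = speculative_decode(target, draft, [0], k=4, max_tokens=8)
print(out)  # → [0, 1, 2, 3, 4, 5, 6, 0, 1]
```

Because acceptance falls back to the target's token on any disagreement, the output is identical to what the target model alone would generate; the draft model only changes how many target evaluations are needed per emitted token.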

Quick Start & Requirements

  • Install: LM Studio 0.3.4+ includes mlx-engine pre-bundled. For standalone use, clone the repo and install dependencies: pip install -U -r requirements.txt.
  • Prerequisites: macOS 14.0 (Sonoma) or later, Python 3.11.
  • Demo: Run python demo.py --model <model_name> for text models or add --images for vision models. Download models using the lms CLI tool.
  • Docs: LM Studio CLI Documentation

Highlighted Details

  • Supports text and vision LLMs (Llama-3.2-Vision, Pixtral, Qwen2-VL, Llava-v1.6).
  • Implements speculative decoding for accelerated inference.
  • Includes pre-commit hooks for code quality and pytest for testing.

Maintenance & Community

The project is maintained by lmstudio-ai. Community support and updates are typically channeled through LM Studio's official platforms.

Licensing & Compatibility

  • mlx-engine itself carries no explicit license in the README, but its core components mlx-lm and mlx-vlm are MIT-licensed, and Outlines is licensed under Apache 2.0.
  • Compatible with LM Studio for Mac.

Limitations & Caveats

The engine runs only on macOS with Apple Silicon and requires Python 3.11 specifically. Standalone usage requires installing dependencies manually.

Health Check

  • Last Commit: 6 days ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 9
  • Issues (30d): 3
  • Star History: 21 stars in the last 30 days

Explore Similar Projects

Starred by Yineng Zhang (Inference Lead at SGLang; Research Scientist at Together AI).

dots.llm1 by rednote-hilab

0% · 466 stars
MoE model for research
Created 5 months ago · Updated 2 months ago
Starred by Junyang Lin (Core Maintainer at Alibaba Qwen), Georgi Gerganov (Author of llama.cpp and whisper.cpp), and 1 more.

LLMFarm by guinmoon

0.2% · 2k stars
iOS/macOS app for local LLM inference
Created 2 years ago · Updated 1 month ago
Starred by Shizhe Diao (Author of LMFlow; Research Scientist at NVIDIA), Yineng Zhang (Inference Lead at SGLang; Research Scientist at Together AI), and 8 more.

EAGLE by SafeAILab

0.5% · 2k stars
Speculative decoding research paper for faster LLM inference
Created 1 year ago · Updated 3 weeks ago