vMLX (jjang-ai): Local AI engine for Apple Silicon
Top 70.4% on SourcePulse
Summary
vMLX is a local AI engine for Apple Silicon Macs that runs LLMs, VLMs, and image generation models entirely on-device. It exposes an OpenAI-, Anthropic-, and Ollama-compatible API, and because inference never leaves the machine, data stays private with no cloud dependencies. The result is a high-performance local inference option for developers and power users.
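Since the API is OpenAI-compatible, any stock OpenAI client can target the local server. A minimal sketch in Python, assuming the server listens at http://localhost:8080/v1 (vMLX's default host and port are not stated here; match whatever vmlx serve reports) and using the model from the quick start below:

```python
# Minimal sketch: chat with a locally served model through vMLX's
# OpenAI-compatible API. The base_url is an assumption; local servers
# typically ignore the API key.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8080/v1",  # assumed endpoint; check `vmlx serve` output
    api_key="not-needed-locally",
)

response = client.chat.completions.create(
    model="mlx-community/Qwen3-8B-4bit",  # model from the quick start below
    messages=[{"role": "user", "content": "Summarize MLX in one sentence."}],
)
print(response.choices[0].message.content)
```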
How It Works
Built on Apple's MLX framework, vMLX optimizes inference for Metal GPUs. Its key innovation is "JANG" adaptive mixed-precision quantization, which achieves better accuracy at lower bitwidths than standard MLX quantization. A five-layer caching architecture (including continuous batching, a prefix cache, a paged KV cache, and a disk cache) sharply reduces latency, and for models that exceed a single machine's capacity, vMLX supports pipeline parallelism across multiple Macs.
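The internals of JANG quantization are not documented here, but the general shape of adaptive mixed-precision is straightforward: measure how sensitive each layer is to quantization and allocate bitwidths accordingly. A toy sketch of that idea (illustrative only, not vMLX's actual algorithm):

```python
# Toy illustration of adaptive mixed-precision quantization: layers whose
# reconstruction error stays low get fewer bits; outlier-heavy layers get more.
# This is NOT the JANG algorithm, which is not documented here.
import numpy as np

def fake_quantize(w: np.ndarray, bits: int) -> np.ndarray:
    """Uniform symmetric quantization followed by dequantization."""
    levels = 2 ** (bits - 1) - 1
    scale = np.abs(w).max() / levels
    return np.round(w / scale) * scale

def pick_bitwidth(w: np.ndarray, candidates=(4, 6, 8), tol=0.05) -> int:
    """Return the lowest bitwidth whose relative reconstruction error is under tol."""
    for bits in candidates:
        err = np.linalg.norm(w - fake_quantize(w, bits)) / np.linalg.norm(w)
        if err < tol:
            return bits
    return candidates[-1]  # sensitive layer: fall back to the widest candidate

rng = np.random.default_rng(0)
smooth = rng.normal(size=(64, 64))  # well-behaved weights quantize cheaply
spiky = rng.normal(size=(64, 64))
spiky[0, 0] = 50.0                  # a single outlier inflates the quantization scale

for name, w in (("smooth", smooth), ("spiky", spiky)):
    print(name, "->", pick_bitwidth(w), "bits")
```

Real schemes typically weight sensitivity by each layer's effect on model loss rather than raw reconstruction error, but the bit-allocation loop has the same structure.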
Quick Start & Requirements
Install via pip install vmlx or uv tool install vmlx, then serve a model with vmlx serve mlx-community/Qwen3-8B-4bit. Apple Silicon hardware is required, and on macOS 14+ you should install with uv, pipx, or a virtual environment rather than bare pip. MLX Studio, a native macOS GUI app, is also available.
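Once the server is running, a quick way to sanity-check it is to list the models it serves; the endpoint below is the same assumed one as in the example above:

```python
# Quick reachability check against the assumed local endpoint.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8080/v1", api_key="not-needed-locally")
for model in client.models.list():
    print(model.id)  # should include the model passed to `vmlx serve`
```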
Highlighted Details
Maintenance & Community
Developed by Jinho Jang (JANGQ AI). While specific community channels are not detailed, the project appears actively maintained.
Licensing & Compatibility
Distributed under the Apache License 2.0, a permissive license that allows commercial use and integration into closed-source applications.
Limitations & Caveats
vMLX runs only on Apple Silicon Macs. Smelt mode is mutually exclusive with VLM capabilities and requires JANG-formatted MoE models. Some features, such as Qwen Image Edit, have high RAM requirements (~54 GB). Bare pip installations may run into issues on macOS 14+; use uv, pipx, or a virtual environment instead.