Swift library for local LLM inference
This Swift library provides a framework for loading and running large language models (LLMs) like Llama on macOS and iOS. It's designed for developers and researchers interested in on-device LLM inference, leveraging Metal for accelerated computation on Apple Silicon.
How It Works
The library is built upon ggml and llama.cpp, enabling efficient LLM execution. It supports various inference and sampling methods, including temperature, top-k, top-p, Tail Free Sampling (TFS), Locally Typical Sampling, Mirostat, and greedy decoding. The architecture is optimized for Apple's Metal framework for GPU acceleration, specifically targeting Apple Silicon hardware.
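To make the sampling strategies concrete, here is a minimal sketch of temperature scaling combined with top-k sampling over raw logits. This is illustrative only: the function and parameter names are assumptions for the example and are not part of the llmfarm_core API.

```swift
import Foundation

// Numerically stable softmax over a vector of logits.
func softmax(_ logits: [Double]) -> [Double] {
    let maxLogit = logits.max() ?? 0
    let exps = logits.map { exp($0 - maxLogit) }
    let sum = exps.reduce(0, +)
    return exps.map { $0 / sum }
}

// Hypothetical temperature + top-k sampler: scale logits, keep the k
// best candidates, renormalize, and draw one token index.
func sampleTopK(logits: [Double], k: Int, temperature: Double) -> Int {
    // Lower temperature sharpens the distribution toward the argmax.
    let scaled = logits.map { $0 / temperature }
    // Keep only the k highest-scoring candidates (original indices preserved).
    let topK = Array(scaled.enumerated()
        .sorted { $0.element > $1.element }
        .prefix(k))
    let probs = softmax(topK.map { $0.element })
    // Draw from the renormalized distribution.
    var r = Double.random(in: 0..<1)
    for (i, p) in probs.enumerated() {
        r -= p
        if r <= 0 { return topK[i].offset }
    }
    return topK[k - 1].offset
}
```

With k = 1 this reduces to greedy decoding; top-p, TFS, and Mirostat differ mainly in how the candidate set is truncated before the draw.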
Quick Start & Requirements
Source and setup instructions are available in the repository: https://github.com/guinmoon/llmfarm_core.swift
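As the project is a Swift package, it can be pulled in via Swift Package Manager. The manifest below is a sketch: the branch name and product/target names are assumptions, so check the repository for the current tags and product name.

```swift
// swift-tools-version:5.5
// Package.swift — example of depending on llmfarm_core via SPM.
import PackageDescription

let package = Package(
    name: "MyLLMApp", // hypothetical consumer package
    dependencies: [
        // Branch "main" is an assumption; pin a tag if one is published.
        .package(url: "https://github.com/guinmoon/llmfarm_core.swift", branch: "main"),
    ],
    targets: [
        .executableTarget(
            name: "MyLLMApp",
            dependencies: [
                // Product name assumed; verify against the package manifest.
                .product(name: "llmfarm_core", package: "llmfarm_core.swift")
            ]
        )
    ]
)
```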
Highlighted Details
Built on ggml and llama.cpp.
Maintenance & Community
The project is under active revision and refactoring, with the author learning Swift during development. Feedback on code style and architecture is welcomed.
Licensing & Compatibility
The license is not explicitly stated in the README. Compatibility for commercial use or closed-source linking is not specified.
Limitations & Caveats
The code is in constant revision and may not be stable. Support for LoRA adapters (training, export, and context restoration) is currently missing. Metal acceleration is not functional on Intel Macs.