High-performance LLM inference engine for macOS
Swama is a high-performance LLM and VLM inference engine for macOS, built with pure Swift on Apple's MLX framework. It targets macOS users and developers seeking efficient, local AI model execution, offering an OpenAI-compatible API, a native menu bar app, and CLI tools for seamless integration and model management.
How It Works
Swama leverages Apple's MLX framework, optimized for Apple Silicon, to deliver fast local inference. Its architecture includes SwamaKit (core logic), Swama CLI (model management and inference), and Swama.app (menu bar UI). It supports multimodal inputs, local audio transcription via Whisper, and text embeddings, all accessible through an OpenAI-compatible API.
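Because the server speaks the OpenAI wire format, any standard OpenAI client or plain HTTP request can drive it. The sketch below builds a chat-completion request against an assumed local endpoint; the host, port, and model name are placeholders, not Swama defaults, so substitute the values your instance reports.

```python
import json
import urllib.request

# Assumed local endpoint; your Swama server's host/port may differ.
BASE_URL = "http://localhost:8080/v1"

def build_chat_request(model: str, prompt: str) -> urllib.request.Request:
    """Build an OpenAI-style chat completion request for a local server."""
    payload = {
        "model": model,  # model identifier as listed by the server
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_chat_request("llama-3.2-3b", "Hello!")  # hypothetical model name
# With the server running, urllib.request.urlopen(req) would send it.
```

Because the format matches OpenAI's, existing SDKs and tools can be pointed at the local base URL without code changes.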
Quick Start & Requirements
Download Swama.dmg from the releases page and install the app, then install the CLI tools via the menu bar app. Because Swama is built on MLX, it requires a Mac with Apple Silicon.
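Once installed and serving a model, the other OpenAI-compatible endpoints, such as text embeddings, work the same way. This is a sketch only: the endpoint path follows the OpenAI convention, but the host, port, and embedding model name are assumptions to adjust for your setup.

```python
import json
import urllib.request

# Assumed endpoint; adjust host/port to match your running Swama server.
url = "http://localhost:8080/v1/embeddings"
payload = {
    "model": "nomic-embed-text",  # hypothetical embedding model name
    "input": ["Swama runs models locally on Apple Silicon."],
}
req = urllib.request.Request(
    url,
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)
# With the server running:
# with urllib.request.urlopen(req) as resp:
#     vector = json.load(resp)["data"][0]["embedding"]
```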