Swama by Trans-N-ai

High-performance LLM inference engine for macOS

created 1 month ago
359 stars

Top 79.1% on sourcepulse

View on GitHub
Project Summary

Swama is a high-performance LLM and VLM inference engine for macOS, built with pure Swift on Apple's MLX framework. It targets macOS users and developers seeking efficient, local AI model execution, offering an OpenAI-compatible API, a native menu bar app, and CLI tools for seamless integration and model management.

How It Works

Swama leverages Apple's MLX framework, optimized for Apple Silicon, to deliver fast local inference. Its architecture includes SwamaKit (core logic), Swama CLI (model management and inference), and Swama.app (menu bar UI). It supports multimodal inputs, local audio transcription via Whisper, and text embeddings, all accessible through an OpenAI-compatible API.
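Because the server speaks the OpenAI wire format, any generic OpenAI-style client can talk to it. A minimal sketch in Python of building and sending a chat-completions request; the host, port, and model alias here are assumptions for illustration, not values taken from Swama's documentation:

```python
import json
import urllib.request

# Base URL of the local Swama server. The port is an assumption --
# use whatever address the menu bar app reports the API is listening on.
BASE_URL = "http://localhost:28100/v1"

def chat_request(model: str, prompt: str) -> bytes:
    """Build an OpenAI-style chat-completions request body."""
    payload = {
        "model": model,  # a model alias managed by Swama (hypothetical name below)
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }
    return json.dumps(payload).encode("utf-8")

def send(body: bytes) -> dict:
    """POST the body to the chat-completions endpoint (requires a running server)."""
    req = urllib.request.Request(
        BASE_URL + "/chat/completions",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

body = chat_request("llama3.2", "Why is the sky blue?")
print(json.loads(body)["model"])  # → llama3.2
```

The same `send` helper works for the embeddings and transcription endpoints by swapping the path, since all three follow the standard OpenAI routes.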

Quick Start & Requirements

  • Install: Download Swama.dmg from the releases page, install the app, then add the CLI tools from the menu bar app.
  • Requirements: macOS 14.0+, Apple Silicon (M1/M2/M3/M4), Xcode 15.0+, Swift 6.1+.
  • Links: Releases

Highlighted Details

  • OpenAI-compatible API for chat completions, embeddings, and audio transcription.
  • Native macOS menu bar application for background inference and management.
  • Smart model management with aliases, automatic downloading, and caching from HuggingFace.
  • Supports multimodal (text/image) inputs and local audio transcription with Whisper.
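For the multimodal input support noted above, the standard OpenAI message shape mixes text and image parts inside one user message. A sketch of constructing such a message, assuming Swama accepts the usual `image_url` content-part format implied by its OpenAI compatibility (the placeholder bytes are not a real image):

```python
import base64

def image_message(prompt: str, image_bytes: bytes) -> dict:
    """Build an OpenAI-style multimodal user message: text plus an inline base64 image."""
    b64 = base64.b64encode(image_bytes).decode("ascii")
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            {
                "type": "image_url",
                "image_url": {"url": f"data:image/png;base64,{b64}"},
            },
        ],
    }

# Placeholder bytes stand in for real PNG data read from disk.
msg = image_message("Describe this image.", b"\x89PNG placeholder")
print(msg["content"][0]["text"])  # → Describe this image.
```

This dict drops straight into the `messages` list of a chat-completions request when targeting a vision-capable model.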

Maintenance & Community

  • Active development with a clear roadmap.
  • Community support via GitHub Discussions.

Licensing & Compatibility

  • MIT License. Permissive for commercial use and closed-source linking.

Limitations & Caveats

  • Requires Apple Silicon hardware and recent macOS versions.
  • Building from source requires Xcode and Swift toolchain setup.
Health Check

  • Last commit: 2 days ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 15
  • Issues (30d): 13
Star History

  • 365 stars in the last 90 days

