swama  by Trans-N-ai

High-performance LLM inference engine for macOS

Created 8 months ago
492 stars

Top 62.9% on SourcePulse

GitHubView on GitHub
1 Expert Loves This Project
Project Summary

Swama is a high-performance LLM and VLM inference engine for macOS, built with pure Swift on Apple's MLX framework. It targets macOS users and developers seeking efficient, local AI model execution, offering an OpenAI-compatible API, a native menu bar app, and CLI tools for seamless integration and model management.

How It Works

Swama leverages Apple's MLX framework, optimized for Apple Silicon, to deliver fast local inference. Its architecture includes SwamaKit (core logic), Swama CLI (model management and inference), and Swama.app (menu bar UI). It supports multimodal inputs, local audio transcription via Whisper, and text embeddings, all accessible through an OpenAI-compatible API.

Quick Start & Requirements

  • Install: Download Swama.dmg from releases and install the app. Install CLI tools via the menu bar app.
  • Requirements: macOS 14.0+, Apple Silicon (M1/M2/M3/M4), Xcode 15.0+, Swift 6.1+.
  • Links: Releases

Highlighted Details

  • OpenAI-compatible API for chat completions, embeddings, and audio transcription.
  • Native macOS menu bar application for background inference and management.
  • Smart model management with aliases, automatic downloading, and caching from HuggingFace.
  • Supports multimodal (text/image) inputs and local audio transcription with Whisper.

Maintenance & Community

  • Active development with a clear roadmap.
  • Community support via GitHub Discussions.

Licensing & Compatibility

  • MIT License. Permissive for commercial use and closed-source linking.

Limitations & Caveats

  • Requires Apple Silicon hardware and recent macOS versions.
  • Building from source requires Xcode and Swift toolchain setup.
Health Check
Last Commit

2 weeks ago

Responsiveness

1 week

Pull Requests (30d)
1
Issues (30d)
1
Star History
23 stars in the last 30 days

Explore Similar Projects

Starred by Andrej Karpathy Andrej Karpathy(Founder of Eureka Labs; Formerly at Tesla, OpenAI; Author of CS 231n), Jeff Hammerbacher Jeff Hammerbacher(Cofounder of Cloudera), and
1 more.

moonshine by moonshine-ai

9.0%
4k
Speech-to-text models optimized for fast, accurate ASR on edge devices
Created 1 year ago
Updated 2 days ago
Feedback? Help us improve.