off-grid-mobile  by alichherawalla

Multimodal AI suite for on-device, private mobile operation

Created 3 weeks ago

615 stars

Top 53.4% on SourcePulse

Project Summary

This project provides a privacy-first AI suite for mobile devices that lets users chat, generate images, process vision inputs, and transcribe speech entirely offline. Aimed at privacy-conscious and power users, it runs all of these capabilities on-device, so no data leaves the phone.

How It Works

Off Grid combines several on-device models and inference engines. Text generation runs GGUF-formatted Large Language Models (LLMs) such as Qwen 3, Llama 3.2, Gemma 3, and Phi-4 through llama.cpp and its React Native binding, llama.rn. Image generation uses on-device Stable Diffusion models, accelerated by the NPU on Snapdragon devices or by Core ML on iOS. Vision tasks use models such as SmolVLM, Qwen3-VL, and Gemma 3n, and voice transcription is handled by whisper.cpp and whisper.rn. Because all processing happens locally, user data stays on the device and the app works without an internet connection.
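As a rough illustration of the routing described above, the task-to-engine mapping could be modeled as a small lookup table. This is a minimal TypeScript sketch, not code from the Off Grid repository: the names Task, EngineInfo, ENGINES, and describeTask are hypothetical, and the table records only what this summary states. The summary does not name an engine library for image generation or vision, so those entries are left null.

```typescript
// Hypothetical sketch: every name here is illustrative, not from the project source.
type Task = "text" | "image" | "vision" | "speech";

interface EngineInfo {
  engine: string | null;  // native inference engine, if the summary names one
  binding: string | null; // React Native binding, if the summary names one
  models: string[];       // example models mentioned in the summary
}

const ENGINES: Record<Task, EngineInfo> = {
  text: {
    engine: "llama.cpp",
    binding: "llama.rn",
    models: ["Qwen 3", "Llama 3.2", "Gemma 3", "Phi-4"],
  },
  // The summary mentions NPU / Core ML acceleration but no engine library.
  image: { engine: null, binding: null, models: ["Stable Diffusion"] },
  vision: { engine: null, binding: null, models: ["SmolVLM", "Qwen3-VL", "Gemma 3n"] },
  speech: { engine: "whisper.cpp", binding: "whisper.rn", models: ["Whisper"] },
};

// Build a one-line, human-readable description of how a task is served.
function describeTask(task: Task): string {
  const { engine, binding, models } = ENGINES[task];
  const runtime = [engine, binding]
    .filter((x): x is string => x !== null)
    .join(" via ");
  const modelList = models.length > 0 ? ` (e.g. ${models.join(", ")})` : "";
  return `${task}: ${runtime || "engine not stated"}${modelList}`;
}
```

For example, describeTask("speech") yields "speech: whisper.cpp via whisper.rn (e.g. Whisper)". Keeping the mapping in one table makes it obvious which modality depends on which native library.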

Quick Start & Requirements

  • Install / run: Download the Android APK from GitHub Releases for the quickest setup. To build from source, run git clone https://github.com/alichherawalla/off-grid-mobile.git, cd off-grid-mobile, and npm install, then npm run android or npm run ios.
  • Prerequisites: Node.js 20+; JDK 17 and Android SDK 36 (Android); Xcode 15+ (iOS).
  • Links: GitHub Releases for APK, Slack Community.
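
Before attempting a source build, the toolchain versions listed above can be checked programmatically. The following is a minimal TypeScript sketch under stated assumptions: RULES, majorOf, and satisfies are hypothetical names, and reading "JDK 17" and "Android SDK 36" as exact versions (while the "+" entries are minimums) is an interpretation of the list, not something the project documents here.

```typescript
// Hypothetical helper: rule names and shapes are illustrative only.
interface Rule {
  major: number;  // required major version
  exact: boolean; // true = must match exactly, false = this version or newer
}

// Versions come from the prerequisites list; the exact/minimum split is an assumption.
const RULES: Record<string, Rule> = {
  node: { major: 20, exact: false },
  jdk: { major: 17, exact: true },
  androidSdk: { major: 36, exact: true },
  xcode: { major: 15, exact: false },
};

// Extract the leading major version from strings like "v20.11.1" or "17.0.9".
function majorOf(version: string): number {
  const major = parseInt(version.replace(/^\D+/, ""), 10);
  if (isNaN(major)) throw new Error(`No version number in "${version}"`);
  return major;
}

// Return true when the given version string meets the rule for that tool.
function satisfies(tool: string, version: string): boolean {
  const rule = RULES[tool];
  if (rule === undefined) throw new Error(`Unknown tool "${tool}"`);
  const major = majorOf(version);
  return rule.exact ? major === rule.major : major >= rule.major;
}
```

With these rules, satisfies("node", "v20.11.1") is true and satisfies("node", "v18.19.0") is false. The version strings would come from each tool's own version output, which this sketch leaves to the caller.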

Highlighted Details

  • Supports a wide range of GGUF LLMs and over 20 Stable Diffusion models.
  • Features on-device Stable Diffusion with real-time preview and NPU/Core ML acceleration.
  • Includes Vision AI for real-time scene description and document analysis directly from the camera.
  • Runs Whisper on-device for private, real-time voice-to-text transcription.
  • Provides native PDF text extraction for document analysis within conversations.
  • AI Prompt Enhancement automatically refines text prompts for image generation.

Maintenance & Community

The project runs a Slack community for user discussion and feedback. Contributions are accepted through GitHub pull requests, and contributor guidelines cover code style and patterns.

Licensing & Compatibility

The project is released under the MIT License, a permissive license that allows commercial use and inclusion in closed-source software as long as the copyright and license notice are retained. It targets Android and iOS.

Limitations & Caveats

The published performance figures were measured on flagship and mid-range devices; results will vary significantly with hardware, model size, and quantization level. Building from source requires specific versions of the Android and iOS development tools.
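
To make the effect of model size and quantization concrete, weight storage can be approximated as parameter count times bits per weight, divided by 8. The TypeScript sketch below applies that arithmetic. The function name approxWeightGB and the 3-billion-parameter example are hypothetical and not taken from the project. The estimate is a lower bound: quantized model files also store scaling data and often keep some tensors at higher precision, and inference needs additional memory for the context cache.

```typescript
// Rough lower-bound estimate of weight storage in decimal gigabytes.
// Formula: parameters x bits per weight / 8 bits per byte / 1e9 bytes per GB.
function approxWeightGB(paramCount: number, bitsPerWeight: number): number {
  if (paramCount <= 0 || bitsPerWeight <= 0) {
    throw new RangeError("paramCount and bitsPerWeight must be positive");
  }
  return (paramCount * bitsPerWeight) / 8 / 1e9;
}

// Hypothetical 3-billion-parameter model at two precisions.
const fp16GB = approxWeightGB(3e9, 16); // 6
const q4GB = approxWeightGB(3e9, 4);    // 1.5
```

Under this estimate, moving the same model from 16-bit to 4-bit weights cuts weight storage to a quarter, which is one reason quantization level matters on memory-constrained phones.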

Health Check

  • Last commit: 1 day ago
  • Responsiveness: Inactive
  • Pull requests (30d): 55
  • Issues (30d): 24
  • Star history: 635 stars in the last 27 days
