flutter_gemma by DenisovAV

Local AI inference SDK for Flutter apps

Created 1 year ago
270 stars

Top 95.1% on SourcePulse

View on GitHub
Project Summary

This Flutter plugin enables on-device execution of various large language models, including Gemma, TinyLlama, Phi, and others, directly within iOS, Android, and Web applications. It targets Flutter developers seeking to integrate advanced AI capabilities like text generation, multimodal input, and function calling into their applications without relying on external servers, thereby enhancing user privacy and enabling offline functionality.

How It Works

The plugin leverages MediaPipe and the LiteRT-LM model format for efficient on-device inference. It manages the download, storage, and initialization of diverse LLM models, supporting both CPU and GPU backends. Key features include multimodal (text + image) input for specific models such as Gemma 3 Nano, function calling for integrating with external services, and a "thinking mode" that exposes the reasoning process of DeepSeek models. It also supports LoRA weights for efficient model fine-tuning and provides robust download management with retry logic.
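
As a rough sketch of that flow, the example below initializes a model on a preferred backend and runs a single chat turn. The class and method names (FlutterGemmaPlugin, createModel, createChat, and so on) are assumptions based on the plugin's documented style, not a verbatim API; check the README of the version you install.

```dart
import 'package:flutter_gemma/flutter_gemma.dart';

// Minimal sketch: initialize a model and run one chat turn.
// All API names here are assumptions; verify against the plugin's README.
Future<String> generateReply(String prompt) async {
  final gemma = FlutterGemmaPlugin.instance;

  // Create an inference model; the plugin uses the model file that the
  // model manager has already downloaded or registered.
  final model = await gemma.createModel(
    modelType: ModelType.gemmaIt,           // assumed enum value for instruction-tuned Gemma
    preferredBackend: PreferredBackend.gpu, // use CPU on devices without GPU support
    maxTokens: 512,
  );

  // Start a chat session, send the prompt, and collect the response.
  final chat = await model.createChat();
  await chat.addQueryChunk(Message.text(text: prompt, isUser: true));
  final response = await chat.generateChatResponse();

  await model.close(); // release native resources
  return response.toString();
}
```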

Quick Start & Requirements

  • Installation: Add flutter_gemma: latest_version to pubspec.yaml and run flutter pub get.
  • Prerequisites: Flutter SDK. Models must be downloaded separately (e.g., from Kaggle or HuggingFace); see the download sketch after this list.
  • Platform-Specific Setup:
    • iOS: Requires platform :ios, '16.0', file sharing enabled, local network usage description, memory entitlements (com.apple.developer.kernel.extended-virtual-addressing, etc.), and static pod linkage (use_frameworks! :linkage => :static).
    • Android: OpenGL support required for GPU acceleration.
    • Web: Currently supports only GPU backend models.
  • Resource Footprint: Varies significantly based on the model size; larger models require substantial device resources.
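
Because model files are fetched separately, a typical first step is to hand the plugin a model URL or local path before creating an inference session. The sketch below does this through the model manager; the method names (modelManager, downloadModelFromNetwork, setModelPath) and the placeholder URL are illustrative assumptions only.

```dart
import 'package:flutter_gemma/flutter_gemma.dart';

// Sketch of preparing a model file before inference. Method names and the
// URL below are illustrative assumptions; consult the README for the real API.
Future<void> prepareModel() async {
  final manager = FlutterGemmaPlugin.instance.modelManager;

  // Option 1: let the plugin download the model (with its retry/resume logic)
  // from a hosting service such as HuggingFace or Kaggle.
  await manager.downloadModelFromNetwork(
    'https://huggingface.co/your-org/your-model/resolve/main/model.task',
  );

  // Option 2: if the model file is already on the device (e.g., copied during
  // development), register its local path instead of downloading.
  // await manager.setModelPath('/data/local/tmp/model.task');
}
```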

Highlighted Details

  • Extensive Model Support: Integrates a wide array of models including Gemma (2B, 7B, 3 Nano, 3 1B, 3 270M), TinyLlama, Hammer, Llama, Phi (2, 3, 4), DeepSeek, Qwen2.5, Falcon, and StableLM.
  • Multimodal Capabilities: Supports text and image input with Gemma 3 Nano vision models.
  • Function Calling: Enables models to execute external functions; supported by Gemma 3 Nano, Hammer 2.1, DeepSeek, and Qwen2.5 (see the sketch after this list).
  • Text Embeddings: Includes support for generating text embeddings via EmbeddingGemma and Gecko models.
  • Reliable Downloads: Features smart retry logic, ETag handling, and automatic resume/restart for model downloads.
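
To make the function-calling bullet concrete, here is a sketch that declares one tool and checks whether the model's reply is a function-call request rather than plain text. The Tool and FunctionCallResponse types and the tools parameter on createChat are assumptions for illustration; the plugin's README documents the actual interface and which models support it.

```dart
import 'package:flutter_gemma/flutter_gemma.dart';

// Sketch of function calling with a single declared tool. Type and method
// names are illustrative assumptions, not the plugin's verbatim API.
Future<void> chatWithTools() async {
  final model = await FlutterGemmaPlugin.instance.createModel(
    modelType: ModelType.gemmaIt, // assumed enum; must be a model that supports function calling
  );

  // Declare one callable function with a JSON-schema-style parameter spec.
  final weatherTool = Tool(
    name: 'get_weather',
    description: 'Returns the current weather for a city',
    parameters: {
      'type': 'object',
      'properties': {
        'city': {'type': 'string'},
      },
      'required': ['city'],
    },
  );

  final chat = await model.createChat(tools: [weatherTool]);
  await chat.addQueryChunk(
    Message.text(text: 'What is the weather in Lisbon?', isUser: true),
  );
  final response = await chat.generateChatResponse();

  // A function-call response carries the tool name and arguments; anything
  // else can be shown to the user as ordinary text.
  if (response is FunctionCallResponse) {
    print('Call ${response.name} with ${response.args}');
  } else {
    print('Model replied: $response');
  }

  await model.close();
}
```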

Maintenance & Community

The provided README does not contain specific details regarding maintainers, community channels (like Discord/Slack), or project roadmaps.

Licensing & Compatibility

The license for this plugin is not explicitly stated in the provided README. Compatibility for commercial use or linking with closed-source applications would depend on the underlying model licenses and the plugin's own license, which requires clarification.

Limitations & Caveats

Web platform support is restricted to GPU backend models. Larger models (e.g., 7B parameters) may be too resource-intensive for typical on-device inference. Multimodal models require significant memory (8GB+ RAM recommended) and specific iOS configurations (iOS 16.0+, memory entitlements). Function calling and thinking mode are only available for specific, supported models. The plugin's license is not specified.

Health Check

  • Last Commit: 1 week ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 5
  • Issues (30d): 13
  • Star History: 22 stars in the last 30 days

Explore Similar Projects

Starred by Andrej Karpathy (Founder of Eureka Labs; formerly at Tesla, OpenAI; author of CS 231n), Gabriel Almeida (Cofounder of Langflow), and 2 more.

torchchat by pytorch

0.1% · 4k stars
PyTorch-native SDK for local LLM inference across diverse platforms
Created 1 year ago · Updated 1 month ago