Android app for local SLM/LLM inference
SmolChat-Android enables on-device inference of GGUF-formatted Large Language Models (LLMs) directly on Android devices. It targets users who want to run AI models locally for privacy, offline use, or experimentation, providing a straightforward interface for interacting with these models.
How It Works
The application leverages the ggerganov/llama.cpp C++ library, which is compiled for Android using the NDK. A JNI binding (smollm.cpp) facilitates communication between the Kotlin-based Android application and the C++ inference engine. This approach allows for efficient execution of LLMs on mobile hardware by utilizing llama.cpp's optimized inference capabilities.
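To illustrate the Kotlin-to-C++ pattern described above (a minimal sketch only; the class, method names, and library name "smollm" are assumptions for illustration, not SmolChat-Android's actual API), a JNI wrapper around an NDK-built llama.cpp library typically looks like this:

```kotlin
// Hypothetical sketch of a Kotlin/JNI wrapper around a llama.cpp-based
// native library. Names here are illustrative assumptions, not the
// project's real API.
class SmolLM {
    companion object {
        init {
            // Loads libsmollm.so, the NDK-built native library that
            // bundles llama.cpp plus the JNI glue code (smollm.cpp).
            System.loadLibrary("smollm")
        }
    }

    // Native functions implemented in C++ and exposed through JNI.
    private external fun loadModel(modelPath: String): Long        // returns a native handle
    private external fun generate(handle: Long, prompt: String): String
    private external fun unloadModel(handle: Long)

    private var handle: Long = 0L

    fun load(modelPath: String) {
        handle = loadModel(modelPath)
        check(handle != 0L) { "Failed to load GGUF model at $modelPath" }
    }

    fun chat(prompt: String): String = generate(handle, prompt)

    fun close() {
        if (handle != 0L) {
            unloadModel(handle)
            handle = 0L
        }
    }
}
```

The native handle returned by loadModel lets the Kotlin side hold a reference to the C++ model object across calls without marshalling it through JNI on every request, which is the usual reason this pattern performs well on mobile hardware.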
Quick Start & Requirements
Clone the repository with its submodules (git clone --depth=1 --recurse-submodules https://github.com/shubham0204/SmolChat-Android) and build with Android Studio.
Highlighted Details
Inference runs fully on-device and is powered by llama.cpp.
Maintenance & Community
The project is maintained by shubham0204. Further community engagement details (Discord, Slack, etc.) are not specified in the README.
Licensing & Compatibility
The project appears to be licensed under the MIT License, allowing for commercial use and integration with closed-source applications.
Limitations & Caveats
The README mentions potential future Vulkan integration for GPU acceleration, implying inference currently runs on the CPU only. Some planned features, such as automatic chat naming and background services, are not yet implemented.