cactus by cactus-compute

Framework for on-device AI, targeting mobile and wearables

Created 8 months ago
4,043 stars

Top 12.0% on SourcePulse

Project Summary

Cactus is a C++ framework for running AI models efficiently on mobile and wearable devices, aimed at developers building cross-platform applications. It offers hardware-aware optimization and bindings for popular mobile development frameworks such as Flutter and React Native, so apps can run AI on-device with a low memory footprint and low battery drain.

How It Works

Cactus builds on the GGML/GGUF ecosystem, specifically integrating with llama.cpp, to run a wide variety of AI models, including LLMs, VLMs, and TTS models. Because of this, it can support any model compatible with the GGUF format and provides a unified backend for diverse AI tasks. The core is written in C++ for performance, with wrappers for integration into mobile applications.
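Since model support hinges on the GGUF format, it can help to sanity-check a downloaded model file before bundling it into an app. The sketch below is a developer-side helper, not part of Cactus: the function names are hypothetical, and it assumes GGUF version 2 or later, where the header is the 4-byte magic "GGUF" followed by a little-endian uint32 version and two uint64 counts.

```python
import struct

GGUF_MAGIC = b"GGUF"
HEADER_SIZE = 24  # 4 (magic) + 4 (version) + 8 (tensor count) + 8 (metadata count)


def read_gguf_header(data: bytes) -> dict:
    """Parse the fixed-size GGUF header from the first 24 bytes of a file.

    Assumes GGUF v2 or later (64-bit counts); v1 used a different layout.
    """
    if len(data) < HEADER_SIZE:
        raise ValueError("need at least 24 bytes to read a GGUF header")
    if data[:4] != GGUF_MAGIC:
        raise ValueError("not a GGUF file: bad magic bytes")
    # "<IQQ": little-endian uint32 version, uint64 tensor count, uint64 KV count
    version, tensor_count, kv_count = struct.unpack_from("<IQQ", data, 4)
    if version < 2:
        raise ValueError("GGUF v1 headers are not handled by this sketch")
    return {
        "version": version,
        "tensor_count": tensor_count,
        "metadata_kv_count": kv_count,
    }


def read_gguf_header_from_file(path: str) -> dict:
    """Hypothetical convenience wrapper: read only the header bytes from disk."""
    with open(path, "rb") as f:
        return read_gguf_header(f.read(HEADER_SIZE))
```

Calling `read_gguf_header_from_file` on a model file returns the format version and the tensor and metadata counts, or raises `ValueError` if the file is not GGUF. It does not verify that a particular runtime can actually load the model.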

Quick Start & Requirements

  • Install: Add cactus to pubspec.yaml for Flutter (flutter pub get) or run npm install cactus-react-native / yarn add cactus-react-native for React Native. For native iOS projects, run npx pod-install in the ios directory.
  • Prerequisites: CMake is required to compile the C++ backend. The examples also require downloading specific models (e.g., Qwen 3, SmolVLM, OuteTTS).
  • Resources: Setup consists of adding the package dependency and, where an example needs one, downloading a model.
  • Docs: Deep Wiki, C++ Docs, Flutter Docs, React-Native Docs.
  • Examples: Ready-to-deploy apps for Flutter Chat, Flutter Notes, React Chat, React Productivity, React Diary, C++ LLM, C++ VLM, C++ TTS.

Highlighted Details

  • Supports text completion, chat completion, VLM, streaming token generation, embedding generation, and early-stage TTS.
  • Features include JSON mode with schema validation, Jinja2 chat templates, background processing, and agentic workflows.
  • Benchmarks show token generation speeds up to 103 tokens/sec on an iPhone 16 Pro Max for SmolLM2 360M Q8.
  • Provides higher-level APIs for sentiment analysis, OCR, and TTS.
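To put the benchmark figure in context, the arithmetic below estimates how long a response takes at a given decode speed and roughly how many bytes the quantized weights occupy. It is a back-of-the-envelope sketch with hypothetical helper names. It assumes every weight is stored in GGML's Q8_0 layout (blocks of 32 int8 values plus one 2-byte scale, 34 bytes per block), and it ignores prompt processing time, the KV cache, and any tensors kept at higher precision.

```python
Q8_0_WEIGHTS_PER_BLOCK = 32
Q8_0_BYTES_PER_BLOCK = 34  # 32 int8 weights + one 2-byte fp16 scale


def decode_seconds(num_tokens: int, tokens_per_sec: float) -> float:
    """Time to generate num_tokens at a steady decode rate (prompt time excluded)."""
    if tokens_per_sec <= 0:
        raise ValueError("tokens_per_sec must be positive")
    return num_tokens / tokens_per_sec


def q8_0_weight_bytes(num_params: int) -> int:
    """Approximate weight storage if all parameters use Q8_0 blocks."""
    # Ceiling division: a partially filled block still occupies a full block.
    blocks = -(-num_params // Q8_0_WEIGHTS_PER_BLOCK)
    return blocks * Q8_0_BYTES_PER_BLOCK


if __name__ == "__main__":
    # Figures from the benchmark bullet: 103 tokens/sec, 360M parameters.
    print(f"256 tokens: {decode_seconds(256, 103):.2f} s")
    print(f"weights:    {q8_0_weight_bytes(360_000_000) / 1e6:.1f} MB")
```

Under these assumptions, a 256-token reply takes roughly 2.5 seconds at 103 tokens/sec, and a 360M-parameter model needs about 382.5 MB for weights alone. Real files and devices will differ.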

Maintenance & Community

  • The project welcomes contributions via pull requests. Testing procedures for Flutter and React Native are noted as upcoming.
  • Community channels: Discord and Twitter.

Licensing & Compatibility

  • The repository does not explicitly state a license in the README.

Limitations & Caveats

  • The framework is at an early stage: TTS support is described as "early stages", and testing procedures for Flutter and React Native are still pending. The README also notes that the Deep Wiki documentation may lag behind the project's rapid pace of updates.

Health Check

  • Last commit: 11 hours ago
  • Responsiveness: 1 day
  • Pull requests (30d): 21
  • Issues (30d): 3
  • Star history: 183 stars in the last 30 days
