cactus  by cactus-compute

Framework for on-device AI, targeting mobile and wearables

created 3 months ago
2,484 stars

Top 19.2% on sourcepulse

Project Summary

Cactus is a C++ framework for running AI models efficiently on mobile and wearable devices, aimed at developers building cross-platform applications. It offers hardware-aware optimization and bindings for popular mobile development frameworks such as Flutter and React Native, so that apps can run AI on-device with a low memory footprint and good battery efficiency.

How It Works

Cactus leverages the GGML/GGUF ecosystem, specifically integrating with Llama.cpp, to run a wide variety of AI models, including LLMs, VLMs, and TTS models. This approach allows it to support any model compatible with the GGUF format, providing a unified backend for diverse AI tasks. The framework is built with a C++ core, ensuring performance, and offers wrappers for easy integration into mobile applications.
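
To make the GGUF dependency concrete, here is a minimal Python sketch that reads the fixed-size header at the start of a GGUF file. It is not part of Cactus or Llama.cpp. The function name read_gguf_header is hypothetical, and the sketch assumes a little-endian GGUF file of version 2 or later, where the 4-byte magic "GGUF" is followed by a uint32 version, a uint64 tensor count, and a uint64 metadata key/value count. The demo parses a synthetic header rather than a real model file.

```python
import io
import struct

GGUF_MAGIC = b"GGUF"  # first four bytes of every GGUF file


def read_gguf_header(stream):
    """Read the fixed-size GGUF header fields from a binary stream.

    Hypothetical helper for illustration; assumes little-endian, GGUF v2+.
    """
    magic = stream.read(4)
    if magic != GGUF_MAGIC:
        raise ValueError(f"not a GGUF file: magic={magic!r}")
    # uint32 format version
    (version,) = struct.unpack("<I", stream.read(4))
    # uint64 tensor count, then uint64 metadata key/value count
    tensor_count, metadata_kv_count = struct.unpack("<QQ", stream.read(16))
    return {
        "version": version,
        "tensor_count": tensor_count,
        "metadata_kv_count": metadata_kv_count,
    }


# Synthetic 20-byte header (made-up counts), standing in for a real model file.
demo_bytes = GGUF_MAGIC + struct.pack("<IQQ", 3, 291, 24)
print(read_gguf_header(io.BytesIO(demo_bytes)))
```

For a real model, the same function could be called on a file opened with open(path, "rb"). A check like this is a cheap way to reject a wrong or truncated download before handing the file to the inference backend.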

Quick Start & Requirements

  • Install: For Flutter, add cactus to pubspec.yaml and run flutter pub get. For React Native, run npm install cactus-react-native or yarn add cactus-react-native. For native iOS projects, run npx pod-install in the ios directory.
  • Prerequisites: CMake is required to compile the C++ backend. The examples download specific models (e.g., Qwen 3, SmolVLM, OuteTTS).
  • Resources: Setup involves adding dependencies and potentially downloading models.
  • Docs: Deep Wiki, C++ Docs, Flutter Docs, React-Native Docs.
  • Examples: Ready-to-deploy apps for Flutter Chat, Flutter Notes, React Chat, React Productivity, React Diary, C++ LLM, C++ VLM, C++ TTS.

Highlighted Details

  • Supports text completion, chat completion, VLM, streaming token generation, embedding generation, and early-stage TTS.
  • Features include JSON mode with schema validation, Jinja2 chat templates, background processing, and agentic workflows.
  • Benchmarks show token generation speeds up to 103 tokens/sec on an iPhone 16 Pro Max for SmolLM2 360M Q8.
  • Provides higher-level APIs for sentiment analysis, OCR, and TTS.
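
As a rough illustration of how streaming token generation and a tokens-per-second figure relate, the Python sketch below consumes a token stream through a callback and reports throughput. It does not use the Cactus API. fake_token_stream is a hypothetical stand-in for a model's streaming output, stream_with_throughput is a hypothetical helper, and the README's benchmark may be measured differently (for example, by excluding prompt processing time).

```python
import time


def fake_token_stream(tokens, delay_s=0.0):
    """Hypothetical stand-in for a model that yields tokens one at a time."""
    for token in tokens:
        if delay_s:
            time.sleep(delay_s)  # simulate per-token generation latency
        yield token


def stream_with_throughput(stream, on_token, clock=time.perf_counter):
    """Pass each token to on_token as it arrives; return (count, tokens/sec)."""
    start = clock()
    count = 0
    for token in stream:
        on_token(token)  # e.g. append to the UI text as soon as it is produced
        count += 1
    elapsed = clock() - start
    tokens_per_sec = count / elapsed if elapsed > 0 else 0.0
    return count, tokens_per_sec


pieces = []
count, tps = stream_with_throughput(
    fake_token_stream(["On", "-device", " AI"], delay_s=0.01),
    pieces.append,
)
print("".join(pieces), f"({count} tokens, {tps:.1f} tokens/sec)")
```

The clock parameter is injectable so that the measurement can be tested deterministically. As a sanity check on the arithmetic, 206 tokens delivered in 2 seconds corresponds to 103 tokens/sec.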

Maintenance & Community

  • Actively welcomes contributions via pull requests. Testing procedures for Flutter/React-Native are noted as upcoming.
  • Community channels: Discord and Twitter.

Licensing & Compatibility

  • The README does not explicitly state a license.

Limitations & Caveats

  • The framework is at an early stage: TTS support is described as being in "early stages", and testing procedures for Flutter and React Native are still pending. The README also notes that the Deep Wiki documentation may lag behind the project's rapid pace of updates.

Health Check

  • Last commit: 1 day ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 11
  • Issues (30d): 16

Star History

  • 2,506 stars in the last 90 days
