clickclickclick by instavm

Framework for autonomous Android/computer use via LLMs

Created 1 year ago
642 stars

Top 51.9% on SourcePulse

Project Summary

This framework enables autonomous control of Android and macOS devices using Large Language Models (LLMs), allowing users to automate complex tasks through natural language prompts. It targets developers and power users seeking to integrate LLM capabilities into device automation workflows.

How It Works

The system uses two LLMs: a "planner" that breaks a task down into actionable steps and a "finder" that locates UI elements on the target device. It supports several LLM backends, including OpenAI's GPT models, Google's Gemini, and local models via Ollama. The framework drives Android devices through ADB and macOS through internal scripting.
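
The planner/finder split described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the project's actual code: `Step`, `run_task`, `build_tap_command`, `stub_planner`, and `stub_finder` are hypothetical names, the coordinates are made up, and both LLM calls are replaced with stubs. The only real interface used is the `adb shell input tap X Y` command, which is built here but not executed.

```python
from typing import Callable, NamedTuple


class Step(NamedTuple):
    """One planner-produced action (hypothetical structure)."""
    action: str  # e.g. "tap"
    target: str  # natural-language description of the UI element


def build_tap_command(x: int, y: int) -> list[str]:
    """Build the adb argument list that taps the screen at (x, y)."""
    return ["adb", "shell", "input", "tap", str(x), str(y)]


def run_task(
    task: str,
    planner: Callable[[str], list[Step]],
    finder: Callable[[str], tuple[int, int]],
) -> list[list[str]]:
    """Ask the planner for steps, ask the finder for coordinates,
    and return the adb commands that would carry them out."""
    commands = []
    for step in planner(task):
        if step.action == "tap":
            x, y = finder(step.target)
            commands.append(build_tap_command(x, y))
    return commands


# Stubs standing in for the two LLM calls (assumptions for illustration).
def stub_planner(task: str) -> list[Step]:
    return [Step("tap", "Gmail icon"), Step("tap", "Compose button")]


def stub_finder(target: str) -> tuple[int, int]:
    known = {"Gmail icon": (120, 640), "Compose button": (900, 1800)}
    return known[target]


if __name__ == "__main__":
    for cmd in run_task("draft an email in Gmail", stub_planner, stub_finder):
        print(" ".join(cmd))
```

Keeping the planner and finder as plain callables mirrors the idea that each role can be backed by a different model; passing the resulting argument lists to `subprocess.run` would be one way to execute them on a connected device.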

Quick Start & Requirements

  • Install via pip: pip install git+https://github.com/BandarLabs/clickclickclick.git
  • Prerequisites: adb installed, USB debugging enabled on Android, Python >= 3.11.
  • Configuration: Set API keys (e.g., OPENAI_API_KEY, GEMINI_API_KEY) in environment variables or config/models.yaml.
  • Usage: CLI (click3 run "task"), Gradio web UI (click3 gradio), or API (Uvicorn server).
  • Docs: https://github.com/BandarLabs/clickclickclick
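
Before the first run, the prerequisites listed above can be checked with a small script. This is a hedged sketch, not part of the project: `check_prerequisites` and `API_KEY_VARS` are hypothetical names. It only inspects environment variables, so a key stored in config/models.yaml would not be detected, and it cannot tell whether USB debugging is enabled on the device.

```python
import os
import shutil
import sys
from typing import Callable, Mapping, Optional

# Variable names taken from the configuration step above.
API_KEY_VARS = ("OPENAI_API_KEY", "GEMINI_API_KEY")


def check_prerequisites(
    env: Mapping[str, str] = os.environ,
    which: Callable[[str], Optional[str]] = shutil.which,
    version: tuple = sys.version_info[:2],
) -> list[str]:
    """Return a list of human-readable problems; an empty list means ready."""
    problems = []
    if version < (3, 11):
        problems.append("Python >= 3.11 is required")
    if which("adb") is None:
        problems.append("adb was not found on PATH")
    if not any(env.get(name) for name in API_KEY_VARS):
        problems.append(
            "no API key in the environment (looked for: "
            + ", ".join(API_KEY_VARS) + ")"
        )
    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print("warning:", problem)
```

The `env`, `which`, and `version` parameters exist only so the check can be exercised without touching the real system; calling `check_prerequisites()` with no arguments inspects the current environment. A missing API key is reported as a warning rather than an error, since a setup that uses only local Ollama models may not need one.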

Highlighted Details

  • Supports local LLMs via Ollama (e.g., Llama 3.2-vision).
  • Demonstrates task automation for Gmail drafting, web browsing, and game launching.
  • Offers both CLI and API interfaces for flexible integration.
  • Configurable planner and finder models for task execution.

Maintenance & Community

The project is actively developed by BandarLabs. Community contributions are encouraged via GitHub issues and pull requests.

Licensing & Compatibility

  • License: MIT.
  • Compatibility: The MIT license permits commercial use and integration with closed-source applications.

Limitations & Caveats

The project describes itself as "highly experimental" and may change significantly. Its macOS support is reported as not fully functional.

Health Check

  • Last commit: 3 months ago
  • Responsiveness: 1 day
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 116 stars in the last 30 days
