clickclickclick by instavm

Framework for autonomous Android/computer use via LLMs

Created 1 year ago

642 stars

Top 51.9% on SourcePulse

1 Expert Loves This Project

hugs (Creator of Selenium)

Project Summary

This framework enables autonomous control of Android and macOS devices using Large Language Models (LLMs), allowing users to automate complex tasks through natural language prompts. It targets developers and power users seeking to integrate LLM capabilities into device automation workflows.

How It Works

The system uses a dual-LLM approach: a "planner" LLM breaks a task into actionable steps, and a "finder" LLM locates UI elements on the target device. It supports several LLM backends, including OpenAI's GPT models, Google's Gemini, and local models via Ollama. The framework interacts with Android devices via ADB and with macOS via internal scripting.
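The planner/finder split described above can be illustrated with a minimal sketch. This is not the project's actual code: the `Step` fields, the `Planner` and `Finder` signatures, and the stub models are all hypothetical, chosen only to show how a plan might be turned into `adb shell input` commands. The sketch builds command lists and does not execute them.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Step:
    """One planner-produced action. The field names are hypothetical."""
    action: str      # "tap" or "type"
    target: str      # natural-language description of a UI element
    text: str = ""   # text to enter for "type" actions


# Hypothetical signatures: a planner maps a task to steps; a finder maps an
# element description plus a screenshot to pixel coordinates.
Planner = Callable[[str], List[Step]]
Finder = Callable[[str, bytes], Tuple[int, int]]


def adb_command(step: Step, xy: Tuple[int, int]) -> List[str]:
    """Translate one step into an `adb shell input` command line."""
    x, y = xy
    if step.action == "tap":
        return ["adb", "shell", "input", "tap", str(x), str(y)]
    if step.action == "type":
        # `input text` treats %s as a space, so spaces are encoded that way.
        return ["adb", "shell", "input", "text", step.text.replace(" ", "%s")]
    raise ValueError(f"unsupported action: {step.action}")


def plan_commands(task: str, planner: Planner, finder: Finder,
                  screenshot: bytes) -> List[List[str]]:
    """Ask the planner for steps, then resolve tap targets with the finder."""
    commands = []
    for step in planner(task):
        # Only taps need coordinates; other actions get a placeholder.
        xy = finder(step.target, screenshot) if step.action == "tap" else (0, 0)
        commands.append(adb_command(step, xy))
    return commands


# Stub models standing in for real LLM calls (assumptions for illustration).
def fake_planner(task: str) -> List[Step]:
    return [Step("tap", "compose button"),
            Step("type", "subject field", "hello world")]


def fake_finder(description: str, screenshot: bytes) -> Tuple[int, int]:
    return (540, 1200)


commands = plan_commands("draft an email", fake_planner, fake_finder, b"")
```

In a real loop, each command would be executed and a fresh screenshot captured before the next finder call, since the screen changes after every action.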

Quick Start & Requirements

Install via pip: pip install git+https://github.com/BandarLabs/clickclickclick.git
Prerequisites: adb installed, USB debugging enabled on Android, Python >= 3.11.
Configuration: Set API keys (e.g., OPENAI_API_KEY, GEMINI_API_KEY) in environment variables or config/models.yaml.
Usage: CLI (click3 run "task"), Gradio web UI (click3 gradio), or API (Uvicorn server).
Docs: https://github.com/BandarLabs/clickclickclick
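Before running a task, the prerequisites listed above can be sanity-checked with a small script. This is a sketch that is not part of the project: the function name `check_prerequisites` and its result keys are hypothetical. It only checks environment variables, so a key stored in config/models.yaml, or a setup that uses only local Ollama models, would still report `api_key_set` as false.

```python
import os
import shutil
import sys


def check_prerequisites(env=None, python_version=None,
                        find_executable=shutil.which):
    """Return a dict of pass/fail flags for the documented prerequisites.

    The parameters exist so the check can be exercised without touching
    the real environment; by default the live values are used.
    """
    env = os.environ if env is None else env
    if python_version is None:
        python_version = sys.version_info[:2]
    return {
        # adb must be discoverable on PATH.
        "adb_on_path": find_executable("adb") is not None,
        # The summary states Python >= 3.11.
        "python_3_11_plus": tuple(python_version) >= (3, 11),
        # At least one of the documented API key variables is non-empty.
        "api_key_set": any(
            env.get(name) for name in ("OPENAI_API_KEY", "GEMINI_API_KEY")
        ),
    }


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{name}: {'ok' if ok else 'missing'}")
```

USB debugging cannot be verified this way; running `adb devices` and confirming the phone is listed as `device` is the usual manual check.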

Highlighted Details

Supports local LLMs via Ollama (e.g., Llama 3.2 Vision).
Demonstrates task automation for Gmail drafting, web browsing, and game launching.
Offers both CLI and API interfaces for flexible integration.
Configurable planner and finder models for task execution.

Maintenance & Community

The project is actively developed by BandarLabs. Community contributions are encouraged via GitHub issues and pull requests.

Licensing & Compatibility

License: MIT.
Compatibility: Suitable for commercial use and integration with closed-source applications.

Limitations & Caveats

The project is described as "highly experimental" and may evolve significantly. Current macOS support is noted as not fully functional.

Health Check

Last Commit: 3 months ago
Responsiveness: 1 day
Pull Requests (30d): 0
Issues (30d): 0

Star History

116 stars in the last 30 days

Explore Similar Projects

wren by bjesus

CLI tool for simple, extensible task management

Created 2 years ago

Updated 8 months ago

AutoDroid by MobileLLM

Smartphone automation via LLM research paper

Created 2 years ago

Updated 1 year ago

open-assistant-api by MLT-OSS

Open-source API for AI assistant/GPT orchestration

Created 2 years ago

Updated 6 months ago

lecca-io by lecca-digital

AI platform for building and deploying LLM-powered agents

Created 1 year ago

Updated 7 months ago

Dive by OpenAgentPlatform

Desktop app for LLM function calling

Created 11 months ago

Updated 2 days ago

Starred by Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems").

macOS-use by browser-use

AI agent for macOS app automation

Created 11 months ago

Updated 10 months ago

pippin by pippinlovesyou

Digital Being framework for autonomous agents

Created 1 year ago

Updated 10 months ago

Starred by Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems").

gptscript by gptscript-ai

Framework for building AI assistants that interact with systems

Created 1 year ago

Updated 3 weeks ago

Starred by Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems") and Gregor Zunic (Cofounder of Browser Use).

droidrun by droidrun

Framework for controlling Android devices via LLM agents

Created 9 months ago

Updated 1 day ago

Starred by Didier Lopes (Founder of OpenBB).

julep by julep-ai

Serverless platform for AI workflow deployment

Created 1 year ago

Updated 1 week ago

Starred by Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), Yaowei Zheng (Author of LLaMA-Factory), and 5 more.

trae-agent by bytedance

LLM-powered CLI for software engineering tasks

Created 7 months ago

Updated 3 months ago

Starred by Theo Browne (Founder of Ping.gg), Hiroshi Shibata (Core Contributor to Ruby), and 25 more.

goose by block

Open-source AI agent for automating complex engineering tasks

Created 1 year ago

Updated 1 day ago
