clickclickclick by instavm

Framework for autonomous Android/computer use via LLMs

Created 1 year ago
661 stars

Top 50.7% on SourcePulse

1 Expert Loves This Project
Project Summary

This framework enables autonomous control of Android and macOS devices using Large Language Models (LLMs), allowing users to automate complex tasks through natural language prompts. It targets developers and power users seeking to integrate LLM capabilities into device automation workflows.

How It Works

The system uses two LLMs: a "planner" that breaks a task into actionable steps, and a "finder" that locates UI elements on the target device. Supported models include OpenAI's GPT models, Google's Gemini, and local models via Ollama. The framework drives Android devices through ADB and macOS through internal scripting.
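The planner/finder split can be sketched roughly as follows. This is a minimal illustration, not the project's actual API: `Step`, `run_task`, and the `Fake*` classes are hypothetical names invented here, and the stand-ins return canned values where a real setup would call an LLM and ADB.

```python
from dataclasses import dataclass, field


@dataclass
class Step:
    """One planned action (hypothetical structure, not the project's own)."""
    action: str      # "tap" or "type"
    target: str      # natural-language description of the UI element
    text: str = ""   # text to enter when action == "type"


class FakePlanner:
    """Stand-in for the planner LLM: returns a canned plan for any task."""
    def plan(self, task: str) -> list[Step]:
        return [
            Step("tap", "Compose button"),
            Step("type", "Subject field", "Hello"),
        ]


class FakeFinder:
    """Stand-in for the finder LLM: maps element descriptions to coordinates."""
    COORDS = {"Compose button": (900, 1800), "Subject field": (540, 600)}

    def locate(self, description: str, screenshot: bytes) -> tuple[int, int]:
        return self.COORDS[description]


@dataclass
class FakeDevice:
    """Stand-in for the device layer: records actions instead of sending them."""
    actions: list = field(default_factory=list)

    def screenshot(self) -> bytes:
        return b""

    def tap(self, x: int, y: int) -> None:
        self.actions.append(("tap", x, y))

    def type_text(self, text: str) -> None:
        self.actions.append(("type", text))


def run_task(task: str, planner, finder, device) -> list[str]:
    """Plan the task, then locate and act on each step's target in order."""
    log = []
    for step in planner.plan(task):
        # The finder sees a fresh screenshot for every step.
        x, y = finder.locate(step.target, device.screenshot())
        device.tap(x, y)
        log.append(f"tap {step.target} at ({x}, {y})")
        if step.action == "type":
            device.type_text(step.text)
            log.append(f"type {step.text!r}")
    return log


if __name__ == "__main__":
    for line in run_task("Draft an email", FakePlanner(), FakeFinder(), FakeDevice()):
        print(line)
```

Keeping the two roles behind separate objects mirrors the configurable planner and finder models: either one could be swapped for a different model without touching the loop.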

Quick Start & Requirements

  • Install via pip: pip install git+https://github.com/BandarLabs/clickclickclick.git
  • Prerequisites: adb installed, USB debugging enabled on Android, Python >= 3.11.
  • Configuration: Set API keys (e.g., OPENAI_API_KEY, GEMINI_API_KEY) in environment variables or config/models.yaml.
  • Usage: CLI (click3 run "task"), Gradio web UI (click3 gradio), or API (Uvicorn server).
  • Docs: https://github.com/BandarLabs/clickclickclick
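Before a first run, the prerequisites listed above can be checked with a short script. This is a sketch under assumptions: `check_prerequisites` and `build_run_command` are hypothetical helpers written for this page, not part of the project. Only environment variables are inspected, so keys stored in config/models.yaml (or a local Ollama setup that needs no key) would still be reported as missing.

```python
import os
import shutil
import sys

# Environment variable names taken from the configuration note above.
API_KEY_VARS = ("OPENAI_API_KEY", "GEMINI_API_KEY")


def check_prerequisites(version_info=sys.version_info,
                        which=shutil.which,
                        environ=os.environ) -> list[str]:
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    if tuple(version_info[:2]) < (3, 11):
        problems.append("Python >= 3.11 is required")
    if which("adb") is None:
        problems.append("adb was not found on PATH")
    if not any(environ.get(name) for name in API_KEY_VARS):
        problems.append("no API key found in " + " or ".join(API_KEY_VARS))
    return problems


def build_run_command(task: str) -> list[str]:
    """Build the argument list for the CLI form shown above: click3 run "task"."""
    return ["click3", "run", task]


if __name__ == "__main__":
    issues = check_prerequisites()
    if issues:
        for issue in issues:
            print("missing:", issue)
    else:
        print("ready:", " ".join(build_run_command("open Gmail and draft an email")))
```

The script does not verify that USB debugging is enabled; that still has to be confirmed on the Android device itself.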

Highlighted Details

  • Supports local LLMs via Ollama (e.g., Llama 3.2-vision).
  • Demonstrates task automation for Gmail drafting, web browsing, and game launching.
  • Offers both CLI and API interfaces for flexible integration.
  • Configurable planner and finder models for task execution.

Maintenance & Community

The project is actively developed by BandarLabs. Community contributions are encouraged via GitHub issues and pull requests.

Licensing & Compatibility

  • License: MIT.
  • Compatibility: The MIT license permits commercial use and integration with closed-source applications.

Limitations & Caveats

The project describes itself as "highly experimental" and may change significantly. macOS support is not yet fully functional.

Health Check

  • Last commit: 4 months ago
  • Responsiveness: 1 day
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 4 stars in the last 30 days

Explore Similar Projects

Starred by Chip Huyen (author of "AI Engineering" and "Designing Machine Learning Systems").

macOS-use by browser-use

0.8%
2k
AI agent for macOS app automation
Created 1 year ago
Updated 11 months ago
Starred by Chip Huyen (author of "AI Engineering" and "Designing Machine Learning Systems") and Gregor Zunic (cofounder of Browser Use).

droidrun by droidrun

0.9%
8k
Framework for controlling Android devices via LLM agents
Created 10 months ago
Updated 5 days ago
Starred by Chip Huyen (author of "AI Engineering" and "Designing Machine Learning Systems"), Yaowei Zheng (author of LLaMA-Factory), and 5 more.

trae-agent by bytedance

0.2%
11k
LLM-powered CLI for software engineering tasks
Created 8 months ago
Updated 2 weeks ago
Starred by Eric Zhu (coauthor of AutoGen; Research Scientist at Microsoft Research), Elie Bursztein (Cybersecurity Lead at Google DeepMind), and 27 more.

goose by block

1.6%
31k
Open-source AI agent for automating complex engineering tasks
Created 1 year ago
Updated 21 hours ago