droidclaw by unitedbyai

AI agents for Android automation

Created 2 months ago
1,311 stars

Top 30.2% on SourcePulse

Project Summary

This project transforms old Android phones into AI-powered agents capable of executing tasks based on plain English goals. It targets developers and power users seeking advanced, API-less mobile automation, offering a novel way to repurpose legacy devices for intelligent, on-device task completion.

How It Works

DroidClaw operates by reading the device's screen content via accessibility trees or screenshots. This information is fed to a Large Language Model (LLM) which determines the next action—tapping, typing, or swiping—executed through ADB. This iterative process, guided by the LLM's "think, plan, action" loop, allows for dynamic automation that adapts to UI changes and complex multi-app workflows without requiring explicit API integrations.
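The loop described above can be sketched as follows. This is a minimal illustration, not DroidClaw's actual code: the helper names (`dump_ui`, `execute`, `run_goal`) and the JSON action shape are assumptions, though the underlying adb commands (`uiautomator dump`, `input tap/text/swipe`) are standard Android tooling.

```python
# Sketch of the observe -> think -> act loop. Helper names and the JSON
# action format are illustrative, not DroidClaw's actual API.
import json
import subprocess


def dump_ui(serial: str) -> str:
    """Observe: dump the accessibility tree of the current screen via adb."""
    out = subprocess.run(
        ["adb", "-s", serial, "exec-out", "uiautomator", "dump", "/dev/tty"],
        capture_output=True, text=True, check=True,
    )
    return out.stdout


def execute(serial: str, action: dict) -> None:
    """Act: translate an LLM-chosen action into an adb input command."""
    if action["type"] == "tap":
        cmd = ["input", "tap", str(action["x"]), str(action["y"])]
    elif action["type"] == "type":
        cmd = ["input", "text", action["text"].replace(" ", "%s")]
    elif action["type"] == "swipe":
        cmd = ["input", "swipe", *(str(c) for c in action["coords"])]
    else:
        return  # unknown action: skip rather than crash
    subprocess.run(["adb", "-s", serial, "shell", *cmd], check=True)


def run_goal(goal, ask_llm, observe, act, max_steps=20):
    """Think: loop until the LLM declares the goal done or we give up.

    observe/act are injected so the loop can be driven by dump_ui/execute
    against a real device, or by stubs when no device is attached.
    """
    for _ in range(max_steps):
        screen = observe()
        action = json.loads(ask_llm(goal, screen))  # LLM replies with a JSON action
        if action["type"] == "done":
            return True
        act(action)
    return False  # step budget exhausted: a possible stuck state
```

On a real device this would be wired up as, e.g., `run_goal("open settings", ask_llm, lambda: dump_ui(serial), lambda a: execute(serial, a))`; bounding the loop with `max_steps` is one simple way to surface the stuck states mentioned below.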

Quick Start & Requirements

  • One-line install: curl -fsSL https://droidclaw.ai/install.sh | sh installs the necessary dependencies, including bun (required; Node/npm will not work) and adb.
  • Manual install: brew install android-platform-tools (for adb), install bun via curl, clone the repository, run bun install, and copy .env.example to .env.
  • Configuration: set an LLM_PROVIDER (e.g., groq, ollama, openai) and the corresponding API keys, or set up a local model (ollama pull llama3.2).
  • Hardware: a phone with USB debugging enabled is essential. Remote control is possible via Tailscale.
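A resulting .env might look like the sketch below. Only LLM_PROVIDER and the provider names are confirmed by the summary above; the API-key variable name is an assumption based on common conventions, so check .env.example for the actual names.

```sh
# Illustrative .env sketch; GROQ_API_KEY is a hypothetical variable name.
LLM_PROVIDER=groq
GROQ_API_KEY=your-key-here

# Local alternative (no API key needed), after running: ollama pull llama3.2
# LLM_PROVIDER=ollama
```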

Highlighted Details

  • Workflows vs. Flows: Supports AI-driven JSON workflows for dynamic, multi-app tasks and YAML flows for instant, fixed-sequence automation.
  • LLM Provider Flexibility: Integrates with Groq (free tier), Ollama (local), OpenRouter, OpenAI, and Bedrock, including vision models.
  • Remote Operation: Enables controlling devices globally via Tailscale, turning phones into always-on, remote agents.
  • Stuck Recovery & Vision Fallback: Includes mechanisms to detect and recover from stuck states and uses screenshots for UIs not accessible via tree dumps.
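For the fixed-sequence side of this split, a YAML flow could look something like the sketch below. The schema is purely illustrative (DroidClaw's actual flow format is not documented in this summary); it only conveys the idea of a predetermined tap/type sequence that runs instantly without consulting an LLM.

```yaml
# Hypothetical flow definition; field names and structure are illustrative only.
name: open-wifi-settings
steps:
  - tap: { x: 540, y: 1800 }   # open Settings from the home screen
  - type: "wifi"               # search within Settings
  - tap: { x: 540, y: 600 }    # select the first result
```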

Maintenance & Community

Developed by unitedby.ai, an open AI community, with notable contributors including Sanju Sivalingam, Somasundaram, and Mahesh. The project's workflow orchestration was influenced by Android Action Kernel.

Licensing & Compatibility

The project is released under the MIT license, permitting broad use, modification, and distribution, including for commercial purposes and integration into closed-source applications.

Limitations & Caveats

DroidClaw is experimental and relies heavily on the LLM's interpretation and decision-making, which can lead to errors. Automation accuracy may vary depending on UI complexity and the chosen LLM's capabilities. Reliance on accessibility trees or screenshots means performance can be impacted by app-specific UI implementations or rapid visual changes.

Health Check

  • Last commit: 1 month ago
  • Responsiveness: Inactive
  • Pull requests (30d): 1
  • Issues (30d): 2
  • Star history: 169 stars in the last 30 days

Explore Similar Projects

Starred by Chip Huyen (author of "AI Engineering" and "Designing Machine Learning Systems") and Gregor Zunic (cofounder of Browser Use).

droidrun by droidrun

  • Framework for controlling Android devices via LLM agents
  • 0.6% on SourcePulse, 8k stars
  • Created 1 year ago, updated 4 days ago