droidclaw by unitedbyai

AI agents for Android automation

Created 2 weeks ago

947 stars

Top 38.5% on SourcePulse

Project Summary

This project transforms old Android phones into AI-powered agents capable of executing tasks based on plain English goals. It targets developers and power users seeking advanced, API-less mobile automation, offering a novel way to repurpose legacy devices for intelligent, on-device task completion.

How It Works

DroidClaw operates by reading the device's screen content via accessibility trees or screenshots. This information is fed to a Large Language Model (LLM) which determines the next action—tapping, typing, or swiping—executed through ADB. This iterative process, guided by the LLM's "think, plan, action" loop, allows for dynamic automation that adapts to UI changes and complex multi-app workflows without requiring explicit API integrations.
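
The loop described above can be sketched as follows. This is a hedged illustration, not DroidClaw's actual API: `readScreen`, `askLlm`, and `runAdb` are stand-ins for the real screen-reading, LLM, and ADB layers.

```typescript
// Hedged sketch of the observe -> think -> act loop described above.
// readScreen, askLlm, and runAdb are illustrative stand-ins, not DroidClaw's real API.

type Action =
  | { kind: "tap"; x: number; y: number }
  | { kind: "type"; text: string }
  | { kind: "done" };

// Stand-in for dumping the accessibility tree (or taking a screenshot) via ADB.
function readScreen(step: number): string {
  return step === 0 ? "<node text='Search'/>" : "<node text='Results'/>";
}

// Stand-in for the LLM call: goal plus screen state in, next action out.
function askLlm(goal: string, screen: string): Action {
  if (screen.includes("Search")) return { kind: "tap", x: 540, y: 120 };
  return { kind: "done" };
}

// Stand-in for executing the action; the real tool shells out to
// commands like `adb shell input tap 540 120`.
function runAdb(action: Action): string {
  switch (action.kind) {
    case "tap":
      return `input tap ${action.x} ${action.y}`;
    case "type":
      return `input text '${action.text}'`;
    case "done":
      return "";
  }
}

function runAgent(goal: string, maxSteps = 10): string[] {
  const executed: string[] = [];
  for (let step = 0; step < maxSteps; step++) {
    const screen = readScreen(step);     // observe
    const action = askLlm(goal, screen); // think / plan
    if (action.kind === "done") break;   // LLM decides the goal is met
    executed.push(runAdb(action));       // act
  }
  return executed;
}
```

Note that each iteration re-reads the screen before deciding the next action; that re-observation is what lets the loop adapt to UI changes mid-task.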

Quick Start & Requirements

The simplest setup uses a bash script: curl -fsSL https://droidclaw.ai/install.sh | sh, which installs the required dependencies, including bun (required; Node/npm will not work) and adb. Manual installation involves:

  • brew install android-platform-tools (for adb)
  • installing bun via its curl installer
  • cloning the repository and running bun install
  • copying .env.example to .env

Configuration requires setting an LLM_PROVIDER (e.g., groq, ollama, openai) and the corresponding API keys, or a local model setup (ollama pull llama3.2). A phone with USB debugging enabled is essential. Remote control is possible via Tailscale.
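
A minimal .env for the Groq provider might look like the following. LLM_PROVIDER is documented in the summary above, but the API-key variable name is an assumption, not the project's confirmed schema; check .env.example for the real names.

```
# .env — LLM_PROVIDER is documented; the key variable name below is assumed.
LLM_PROVIDER=groq
GROQ_API_KEY=your-key-here

# For a local model instead, something like:
#   LLM_PROVIDER=ollama   (after running `ollama pull llama3.2`)
```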

Highlighted Details

  • Workflows vs. Flows: Supports AI-driven JSON workflows for dynamic, multi-app tasks and YAML flows for instant, fixed-sequence automation.
  • LLM Provider Flexibility: Integrates with Groq (free tier), Ollama (local), OpenRouter, OpenAI, and Bedrock, including vision models.
  • Remote Operation: Enables controlling devices globally via Tailscale, turning phones into always-on, remote agents.
  • Stuck Recovery & Vision Fallback: Includes mechanisms to detect and recover from stuck states and uses screenshots for UIs not accessible via tree dumps.
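
For the fixed-sequence YAML flows, a file might look roughly like this. The summary does not show the actual schema, so every key and action name here is illustrative only:

```yaml
# Hypothetical flow file — step and action names are illustrative, not the real schema.
name: open-wifi-settings
steps:
  - tap: { x: 540, y: 1200 }   # settings icon on the launcher
  - wait: 500                  # milliseconds
  - type: "wifi"               # search within Settings
```

Because flows are fixed sequences with no LLM in the loop, they run instantly but break if the UI layout changes; the JSON workflows trade that speed for adaptability.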

Maintenance & Community

Developed by unitedby.ai, an open AI community, with notable contributors including Sanju Sivalingam, Somasundaram, and Mahesh. The project's workflow orchestration was influenced by Android Action Kernel.

Licensing & Compatibility

The project is released under the MIT license, permitting broad use, modification, and distribution, including for commercial purposes and integration into closed-source applications.

Limitations & Caveats

DroidClaw is experimental and relies heavily on the LLM's interpretation and decision-making, which can lead to errors. Automation accuracy may vary depending on UI complexity and the chosen LLM's capabilities. Reliance on accessibility trees or screenshots means performance can be impacted by app-specific UI implementations or rapid visual changes.

Health Check

  • Last Commit: 19 hours ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 2
  • Issues (30d): 5

Star History

965 stars in the last 19 days

Explore Similar Projects

Starred by Chip Huyen (author of "AI Engineering" and "Designing Machine Learning Systems") and Gregor Zunic (cofounder of Browser Use).

droidrun by droidrun

0.9% · 8k
Framework for controlling Android devices via LLM agents
Created 10 months ago · Updated 5 days ago