droidrun by droidrun

Framework for controlling Android devices via LLM agents

created 3 months ago
3,741 stars

Top 13.2% on sourcepulse

Project Summary

DroidRun is a framework for automating Android device interactions using natural language commands powered by Large Language Models (LLMs). It targets developers and power users seeking to automate repetitive tasks, perform UI testing, or provide remote assistance on Android devices, offering a flexible Python API and a user-friendly CLI.

How It Works

DroidRun uses a ReAct (Reasoning and Acting) agent architecture. The agent receives a natural language task, uses an LLM to break it into actionable steps, and executes those steps on the Android device through the DroidRun Portal app. Screenshot analysis gives the agent a visual understanding of the device state, which supports complex, multi-step tasks.
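The reason-then-act loop described above can be sketched in a few lines of Python. This is a minimal illustration of the general ReAct pattern, not DroidRun's actual API: `FakeDevice`, `scripted_policy`, and `run_react` are hypothetical names, the "LLM" is a hard-coded function, and the "device" only records actions instead of touching a real phone.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class FakeDevice:
    """Hypothetical stand-in for an Android device; records actions only."""
    screen: str = "home"
    log: List[Tuple[str, str]] = field(default_factory=list)

    def observe(self) -> str:
        # A real agent would capture a screenshot or UI state here.
        return self.screen

    def act(self, action: str, arg: str) -> None:
        self.log.append((action, arg))
        if action == "open_app":
            self.screen = arg


def scripted_policy(task: str, observation: str) -> Tuple[str, str, str]:
    """Hypothetical stand-in for an LLM call: returns (thought, action, arg)."""
    if observation != "settings":
        return ("Settings is not open yet.", "open_app", "settings")
    return ("Settings is on screen, so the task is complete.", "done", "")


def run_react(
    task: str,
    device: FakeDevice,
    policy: Callable[[str, str], Tuple[str, str, str]],
    max_steps: int = 5,
) -> bool:
    """Alternate observing, reasoning, and acting until 'done' or the step limit."""
    for step in range(1, max_steps + 1):
        observation = device.observe()
        thought, action, arg = policy(task, observation)
        print(f"step {step}: {thought} -> {action}({arg})")
        if action == "done":
            return True
        device.act(action, arg)
    return False


if __name__ == "__main__":
    run_react("Open the settings app", FakeDevice(), scripted_policy)
```

The step limit is a design choice in this sketch: it keeps the loop from running forever if the policy never reports completion.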

Quick Start & Requirements

  • Install: pip install droidrun
  • Prerequisites: Android device connected via USB or TCP/IP, ADB installed, DroidRun Portal app installed on the device, and an API key for an LLM provider (OpenAI, Anthropic, or Gemini).
  • Setup: (1) install the DroidRun Portal APK with droidrun setup --path=...; (2) configure an API key, e.g., export OPENAI_API_KEY="..."; (3) connect the device with droidrun connect <ip>; (4) verify the connection with droidrun status.
  • Docs: https://github.com/droidrun/droidrun
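Before running the setup commands, it can help to confirm that the prerequisites are in place. The following is a small, hypothetical preflight script, not part of DroidRun. It checks only that `adb` and the `droidrun` CLI are on the PATH and that an API key is set. `OPENAI_API_KEY` comes from the setup step above; the other two variable names are assumptions based on common provider conventions.

```python
import os
import shutil

# OPENAI_API_KEY is named in the setup step; the other two names are
# assumptions and may not match what DroidRun actually reads.
CANDIDATE_KEYS = ("OPENAI_API_KEY", "ANTHROPIC_API_KEY", "GEMINI_API_KEY")


def preflight(env=None, which=shutil.which):
    """Return a list of problems found; an empty list means nothing is missing."""
    env = os.environ if env is None else env
    problems = []
    if which("adb") is None:
        problems.append("adb not found on PATH")
    if which("droidrun") is None:
        problems.append("droidrun CLI not found on PATH (try: pip install droidrun)")
    if not any(env.get(name) for name in CANDIDATE_KEYS):
        problems.append("no LLM provider API key set")
    return problems


if __name__ == "__main__":
    for problem in preflight():
        print("missing:", problem)
```

The `env` and `which` parameters exist only so the check can be exercised without changing the real environment. The script does not verify that a device is connected or that the Portal app is installed; droidrun status covers that.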

Highlighted Details

  • Supports multiple LLM providers: OpenAI, Anthropic, Gemini.
  • CLI for direct command execution and Python API for custom automations.
  • Includes screenshot analysis for visual understanding of the device state.
  • Demo videos showcase use cases like shopping assistance and social media automation.

Maintenance & Community

The project is actively maintained by the droidrun organization. Contributions are welcome via Pull Requests.

Licensing & Compatibility

Licensed under the MIT License, permitting commercial use and integration with closed-source projects.

Limitations & Caveats

The roadmap lists planned improvements to memory and vision capabilities, along with integrations with other agent frameworks such as LangChain, so these areas are still in development. A hosted version and an app store are also planned.

Health Check

  • Last commit: 1 day ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 8
  • Issues (30d): 22

Star History

  • 1,598 stars in the last 90 days
