Smartphone automation via LLM research paper
AutoDroid enables Large Language Models (LLMs) to automate tasks on smartphones, using each app's UI elements together with a natural-language task description. It targets researchers and developers working on AI-driven mobile automation and provides a framework for LLM-powered smartphone interaction.
How It Works
AutoDroid builds upon the DroidBot framework, integrating LLMs to interpret tasks and interact with Android applications. It captures UI states (screenshots and view hierarchies) and uses these, along with task descriptions, to prompt an LLM for actions. The LLM's output is then translated into executable commands to navigate and operate the mobile application.
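The capture, prompt, and translate cycle described above can be sketched roughly as follows. This is a minimal illustration, not AutoDroid's or DroidBot's actual code: `UIElement`, `build_prompt`, `parse_action`, `step`, and the JSON reply format are all hypothetical names and conventions chosen for the example, and the UI state is reduced to a text list of elements (screenshots are omitted). The LLM is passed in as a plain callable so the sketch runs without any model or device.

```python
import json
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class UIElement:
    """One node from a captured view hierarchy (hypothetical, simplified)."""
    index: int
    widget: str
    text: str


def build_prompt(task: str, elements: List[UIElement], history: List[str]) -> str:
    """Render the task, past actions, and current UI state as plain text."""
    ui = "\n".join(f"[{e.index}] {e.widget}: {e.text}" for e in elements)
    past = "; ".join(history) if history else "none"
    return (
        f"Task: {task}\n"
        f"Previous actions: {past}\n"
        f"Current UI elements:\n{ui}\n"
        'Reply with JSON such as {"action": "tap", "index": 0}, '
        '{"action": "input", "index": 0, "text": "..."}, '
        'or {"action": "finish"}.'
    )


def parse_action(reply: str, elements: List[UIElement]) -> Optional[dict]:
    """Validate the model reply; return None if it is malformed or off-screen."""
    try:
        data = json.loads(reply)
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict):
        return None
    action = data.get("action")
    if action == "finish":
        return {"action": "finish"}
    index = data.get("index")
    valid = {e.index for e in elements}
    if action not in ("tap", "input") or not isinstance(index, int) or index not in valid:
        return None
    if action == "input" and not isinstance(data.get("text"), str):
        return None
    return data


def step(task: str, elements: List[UIElement], history: List[str],
         ask_llm: Callable[[str], str]) -> Optional[dict]:
    """Run one iteration: prompt the model, validate its reply, record it."""
    action = parse_action(ask_llm(build_prompt(task, elements, history)), elements)
    if action is not None:
        history.append(json.dumps(action, sort_keys=True))
    return action


if __name__ == "__main__":
    screen = [UIElement(0, "Button", "Settings"), UIElement(1, "EditText", "Search")]
    # Stand-in for a real model call; always taps element 0.
    fake_llm = lambda prompt: '{"action": "tap", "index": 0}'
    print(step("Open settings", screen, [], fake_llm))
```

Rejecting replies that are not valid JSON or that point at an element not on screen is one simple way to contain model randomness before anything is sent to the device; a real driver would then map the validated action to an actual input event.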
Quick Start & Requirements
git clone git@github.com:MobileLLM/AutoDroid.git && cd AutoDroid/ && pip install -e .
Requires platform_tools in PATH.
Highlighted Details
Maintenance & Community
Licensing & Compatibility
Limitations & Caveats
The current implementation struggles to determine when a task is complete, and its automation performance is unstable because of LLM randomness and GUI variations. It requires an ADB-connected host machine and is not a standalone on-device solution. The project warns that it may take unintended actions, such as modifying accounts.
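Because a host with a working ADB connection is required, a small pre-flight check can be useful before starting a run. The sketch below is an assumption-laden helper, not part of AutoDroid: `connected_devices` and `check_adb` are hypothetical names. It only relies on the standard `adb devices` command, whose output lists one serial and state per line.

```python
import shutil
import subprocess
from typing import List


def connected_devices(adb_output: str) -> List[str]:
    """Return serials that `adb devices` reports in the 'device' state."""
    serials = []
    for line in adb_output.splitlines():
        parts = line.split()
        # Device rows have exactly two fields: serial and state.
        if len(parts) == 2 and parts[1] == "device":
            serials.append(parts[0])
    return serials


def check_adb() -> List[str]:
    """Locate adb on PATH and list the devices that are ready to use."""
    adb = shutil.which("adb")
    if adb is None:
        raise RuntimeError("adb not found; add platform_tools to PATH")
    result = subprocess.run(
        [adb, "devices"], capture_output=True, text=True, check=True
    )
    return connected_devices(result.stdout)


if __name__ == "__main__":
    try:
        print(check_adb())
    except RuntimeError as err:
        print(err)
```

Devices listed as `unauthorized` or `offline` are deliberately skipped, since they cannot accept commands until that state is resolved on the device itself.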
1 year ago
Inactive