Benchmark for autonomous agents on Android
AndroidWorld is an environment and benchmark for developing and evaluating autonomous agents that control Android devices. It offers a reproducible benchmark of 116 hand-crafted tasks across 20 apps, with millions of dynamic variations, targeting AI researchers and developers building agents for mobile platforms.
How It Works
AndroidWorld operates on a live Android emulator, providing agents with screenshots and UI element information to perform actions. It supports millions of task variations through dynamic instantiation of parameters within tasks. The environment includes durable reward signals for reliable evaluation and integrates with the MiniWoB++ web benchmark, rendering web elements as native Android UI widgets for a unified interaction model.
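As a rough illustration of dynamic parameter instantiation and state-based rewards, the sketch below fills a task template from a seeded random generator and scores success by inspecting a simulated device state. It is a minimal sketch under stated assumptions, not AndroidWorld's code: `TaskInstance`, `instantiate_contact_task`, `is_successful`, and the contact data are all hypothetical names invented for this example.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskInstance:
    """One concrete variation of a task template (hypothetical structure)."""
    goal: str
    params: dict


def instantiate_contact_task(seed: int) -> TaskInstance:
    """Fill a task template with parameters drawn from a seeded RNG."""
    rng = random.Random(seed)  # the same seed reproduces the same variation
    name = rng.choice(["Ana Silva", "Wei Chen", "Omar Haddad"])
    phone = "555-" + "".join(str(rng.randint(0, 9)) for _ in range(4))
    goal = f"Create a contact named {name} with phone number {phone}."
    return TaskInstance(goal=goal, params={"name": name, "phone": phone})


def is_successful(task: TaskInstance, device_contacts: dict) -> float:
    """Reward derived from the (simulated) device state, not from the agent's actions."""
    expected = task.params["phone"]
    return 1.0 if device_contacts.get(task.params["name"]) == expected else 0.0


task = instantiate_contact_task(seed=7)
print(task.goal)
# A device state containing the requested contact earns the full reward.
print(is_successful(task, {task.params["name"]: task.params["phone"]}))
# An empty contact list earns nothing.
print(is_successful(task, {}))
```

Because the parameters come from the seed, re-running with the same seed yields the same goal, which is what makes a randomized task reproducible across evaluation runs.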
Quick Start & Requirements
Requirements include ffmpeg and an Android emulator set up with a specific AVD configuration (Pixel 6, API Level 33, named AndroidWorldAvd).
Installation: install the Python dependencies (pip install -r requirements.txt) and set API keys as environment variables.
Running: use python minimal_task_runner.py for a basic test or python run.py for benchmarks. The --perform_emulator_setup flag is required for initial app installation and permissions.
Highlighted Details
Custom agents can be built on EnvironmentInteractingAgent.
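To make the idea of an environment-interacting agent concrete, here is a toy sketch of the general pattern: the agent holds a reference to an environment, observes UI elements, and takes one action per step. It does not import or reproduce AndroidWorld's actual classes. `ToyEnv`, `SketchAgent`, and `KeywordTapAgent` are hypothetical stand-ins, and the real base class may have a different constructor and return type.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field


@dataclass
class ToyEnv:
    """Stand-in for a device: exposes UI element labels and records taps."""
    ui_elements: list
    tapped: list = field(default_factory=list)

    def observe(self) -> list:
        # Return a copy so agents cannot mutate the environment's state.
        return list(self.ui_elements)

    def tap(self, element: str) -> None:
        self.tapped.append(element)


class SketchAgent(ABC):
    """Hypothetical base class: stores the environment; subclasses implement step()."""

    def __init__(self, env: ToyEnv):
        self.env = env

    @abstractmethod
    def step(self, goal: str) -> bool:
        """Take one action toward the goal; return True if the agent thinks it is done."""


class KeywordTapAgent(SketchAgent):
    """Taps the first visible element whose label appears in the goal text."""

    def step(self, goal: str) -> bool:
        for element in self.env.observe():
            if element.lower() in goal.lower():
                self.env.tap(element)
                return True
        return False


env = ToyEnv(ui_elements=["Settings", "Contacts", "Camera"])
agent = KeywordTapAgent(env)
done = agent.step("Open the contacts app")
print(done, env.tapped)
```

Keeping the decision logic inside a single `step` method means the surrounding runner can drive any agent the same way, whether it is a keyword matcher like this one or a model-backed policy.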
Maintenance & Community
This project is from google-research. It is noted as "not an officially supported Google product."
Licensing & Compatibility
The repository does not explicitly state a license in the README. This requires further investigation for commercial use or closed-source linking.
Limitations & Caveats
The README does not specify a license, which may pose a barrier to commercial adoption or integration with proprietary systems. Initial setup of the Android emulator and AVD requires careful adherence to specific instructions.