THUDM: Android autonomous agent training and benchmarking framework
Summary
AndroidLab provides a systematic framework for training and benchmarking autonomous agents on Android devices. It addresses the need for reproducible evaluation of AI agents in complex mobile environments. The project offers a comprehensive benchmark suite and an operation environment, benefiting researchers and developers aiming to build and assess sophisticated Android agents.
How It Works
The framework comprises an operation environment and a reproducible benchmark of 138 tasks across nine Android applications. The apps, which include Bluecoins, Calendar, and Maps.me, were selected because they work offline, which keeps test conditions consistent. AndroidLab supports two execution modes: AVD on Mac (arm64) and Docker on Linux (x86_64). This setup supports both systematic evaluation and training: instruction tuning on the provided Android Instruct dataset lets open-source models reach performance comparable to proprietary agents.
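As a rough illustration of the two execution modes, the sketch below maps a host's operating system and CPU architecture to one of them. It is not part of AndroidLab: the function name pick_execution_mode and the labels "avd" and "docker" are hypothetical, chosen only for this example.

```python
import platform

# Host combinations named in the summary above.
# The labels "avd" and "docker" are illustrative, not AndroidLab identifiers.
SUPPORTED_MODES = {
    ("Darwin", "arm64"): "avd",      # AVD on Mac (arm64)
    ("Linux", "x86_64"): "docker",   # Docker on Linux (x86_64)
}


def pick_execution_mode(system=None, machine=None):
    """Return the mode label for a host, or None if the host is unsupported.

    When arguments are omitted, the current host is inspected.
    """
    system = system or platform.system()
    machine = machine or platform.machine()
    return SUPPORTED_MODES.get((system, machine))


if __name__ == "__main__":
    mode = pick_execution_mode()
    print(mode or "host does not match either documented execution mode")
```

Passing explicit values, for example pick_execution_mode("Linux", "x86_64"), makes it possible to check a target machine without running the script on it.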
Quick Start & Requirements
Installation involves cloning the repository, creating a Python 3.11 Conda environment, and installing dependencies via pip install -r requirements.txt. Users must set up either AVD on Mac (arm64) or Docker on Linux (x86_64) following guides linked within the README. Each concurrent session requires approximately 6GB of memory and 9GB of storage.
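The per-session figures above give a quick way to estimate how many sessions a machine can run at once. The sketch below is a back-of-the-envelope calculation only: the function max_concurrent_sessions is hypothetical, and it treats the approximate 6GB and 9GB figures as exact with no headroom for the host system.

```python
# Approximate per-session requirements quoted in the summary above.
MEM_PER_SESSION_GB = 6
STORAGE_PER_SESSION_GB = 9


def max_concurrent_sessions(free_mem_gb, free_storage_gb):
    """Estimate how many sessions fit in the given free memory and storage.

    The tighter of the two resources determines the result.
    """
    by_memory = int(free_mem_gb // MEM_PER_SESSION_GB)
    by_storage = int(free_storage_gb // STORAGE_PER_SESSION_GB)
    return max(0, min(by_memory, by_storage))


if __name__ == "__main__":
    # Example: 32 GB of free memory and 100 GB of free storage.
    print(max_concurrent_sessions(32, 100))
```

In the example, memory is the limiting resource: 32GB allows five sessions, while 100GB of storage would allow eleven.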
Maintenance & Community
No specific details regarding maintainers, community channels (e.g., Discord, Slack), or roadmap were found in the provided README excerpt.
Licensing & Compatibility
The license type and any compatibility notes for commercial use or closed-source linking are not specified in the provided README excerpt.
Limitations & Caveats
The framework runs only in two execution environments: AVD on Mac (arm64) and Docker on Linux (x86_64). Evaluation requires API keys for judge models such as GPT-4o or GLM4.