AppAgentX by Westlake-AGI-Lab

GUI agent research paper implementation

created 5 months ago
481 stars

Top 64.6% on sourcepulse

Project Summary

AppAgentX is an evolutionary framework for GUI agents that aims to make LLM-based smartphone agents more efficient without sacrificing intelligence or flexibility. It targets researchers and developers building LLM agents for mobile applications. The agent automates complex tasks by learning from its execution history and replacing repetitive low-level actions with high-level shortcuts.

How It Works

AppAgentX employs a memory mechanism to store task execution history. By analyzing this history, the agent identifies and learns to replace sequences of repetitive low-level operations with high-level "shortcut" actions. This evolutionary process enhances operational efficiency, allowing the agent to focus its reasoning capabilities on novel or complex tasks while streamlining routine operations. The framework leverages LangChain and LangGraph for agent construction, Neo4j for memory storage, and Pinecone for vector storage.
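The shortcut idea can be illustrated with a minimal sketch. This is not the AppAgentX implementation: it assumes the execution history is a plain list of action-name traces, uses simple n-gram counting in place of the project's Neo4j/Pinecone memory and LLM reasoning, and the names `mine_shortcuts`, `compress`, and the `shortcut_N` labels are hypothetical.

```python
from collections import Counter


def mine_shortcuts(traces, length=3, min_count=2):
    """Map each action sequence of `length` seen >= min_count times to a shortcut name."""
    counts = Counter()
    for trace in traces:
        for i in range(len(trace) - length + 1):
            counts[tuple(trace[i:i + length])] += 1
    frequent = sorted(seq for seq, n in counts.items() if n >= min_count)
    return {seq: f"shortcut_{k}" for k, seq in enumerate(frequent)}


def compress(trace, shortcuts):
    """Rewrite a trace, replacing known low-level sequences with their shortcut."""
    out, i = [], 0
    while i < len(trace):
        for seq, name in shortcuts.items():
            if tuple(trace[i:i + len(seq)]) == seq:
                out.append(name)
                i += len(seq)
                break
        else:
            # No shortcut starts here, so keep the low-level action as-is.
            out.append(trace[i])
            i += 1
    return out


# Hypothetical history: two tasks that share a "search" routine.
history = [
    ["tap:search", "type:query", "tap:submit", "scroll:down"],
    ["tap:home", "tap:search", "type:query", "tap:submit"],
]
shortcuts = mine_shortcuts(history)
compressed = compress(history[0], shortcuts)  # ['shortcut_0', 'scroll:down']
```

The three-step search routine appears in both traces, so it collapses into a single high-level action, while the remaining actions stay at the low level.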

Quick Start & Requirements

  • Installation: pip install -r requirements.txt
  • Prerequisites:
    • Python (the README does not specify a version; LangChain and LangGraph generally need a recent Python 3 release)
    • Neo4j database
    • Pinecone API keys
    • Docker with GPU support for screen recognition/feature extraction services.
    • Android Debug Bridge (ADB) configured for device connection or an Android emulator via Android Studio.
  • Launch Demo: python demo.py or gradio demo.py
  • Links: Neo4j, Pinecone, LangChain, LangGraph, ADB, Android Studio Emulator
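
Before launching the demo, it can help to confirm that the command-line prerequisites are reachable. The sketch below is not part of the project; it assumes the tools are installed under their usual executable names (`adb`, `docker`), and `missing_tools` is a hypothetical helper. It does not check the Neo4j database, Pinecone keys, or GPU support.

```python
import shutil


def missing_tools(tools):
    """Return the command names from `tools` that are not found on PATH."""
    return [name for name in tools if shutil.which(name) is None]


if __name__ == "__main__":
    # "adb" and "docker" are assumed executable names; adjust if yours differ.
    missing = missing_tools(["adb", "docker"])
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("adb and docker are both on PATH.")
```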

Highlighted Details

  • Evolutionary framework for GUI agents to improve efficiency.
  • Memory mechanism for learning and creating high-level action shortcuts.
  • Modular design for screen recognition and feature extraction services.
  • Supports interaction with physical Android devices or emulators.

Maintenance & Community

The project is associated with Westlake University's AGI Lab. Contact information for Wenjia Jiang is provided. No specific community channels (Discord/Slack) or roadmap links are mentioned in the README.

Licensing & Compatibility

The README does not explicitly state a license. It mentions the code will be open-sourced, but specific terms, restrictions, or compatibility for commercial use are not detailed.

Limitations & Caveats

The project is presented as an official implementation of a research paper, suggesting it may be experimental. Docker with GPU support is required for backend services, which could be a barrier for some users. Specific Python version requirements are not detailed.

Health Check

  • Last commit: 3 months ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 0
  • Issues (30d): 1
  • Star History: 142 stars in the last 90 days
