Awesome-GUI-Agents by ZJU-REAL

A curated collection for developing advanced GUI agents

Created 11 months ago
305 stars

Top 88.0% on SourcePulse

Project Summary

Summary

This repository curates resources, tools, and frameworks for developing GUI Agents, targeting AI researchers and developers. It serves as a centralized hub for the latest advancements, papers, datasets, and benchmarks in intelligent agent interaction with graphical user interfaces, aiming to accelerate R&D.

How It Works

The project functions as an organized collection, not an executable framework. It categorizes information around the four core GUI Agent modules: perception, exploration, planning, and interaction. Its primary value lies in extensive, chronologically updated lists of research papers, datasets, and benchmarks, covering reinforcement learning, vision-language models, and multimodal agents across desktop, web, and mobile platforms.
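
The four modules named above can be pictured as stages of a single agent loop. The following is a minimal Python sketch of that idea, not code from the repository, which ships no executable framework. Every name here (`ToyAgent`, `Element`, the dictionary-based screen format, the label-matching planner) is a hypothetical stand-in chosen for illustration; real systems from the linked papers would typically use vision-language models for perception and planning.

```python
from dataclasses import dataclass, field


@dataclass
class Element:
    """A hypothetical on-screen widget: a label plus a click target."""
    label: str
    x: int
    y: int


@dataclass
class ToyAgent:
    """Toy four-stage pipeline; all names are illustrative assumptions."""
    goal: str
    visited: set = field(default_factory=set)

    def perceive(self, screen):
        # Perception: turn a raw screen description into structured elements.
        return [Element(e["label"], e["x"], e["y"]) for e in screen]

    def explore(self, elements):
        # Exploration: remember which labels have been seen so far
        # and report the ones that are new on this screen.
        unseen = [e.label for e in elements if e.label not in self.visited]
        self.visited.update(e.label for e in elements)
        return unseen

    def plan(self, elements):
        # Planning: pick the element whose label matches the goal, if any.
        for e in elements:
            if e.label.lower() == self.goal.lower():
                return e
        return None

    def interact(self, target):
        # Interaction: emit a low-level action for the chosen element.
        if target is None:
            return {"action": "wait"}
        return {"action": "click", "x": target.x, "y": target.y}

    def step(self, screen):
        # One pass through perception, exploration, planning, interaction.
        elements = self.perceive(screen)
        self.explore(elements)
        return self.interact(self.plan(elements))


if __name__ == "__main__":
    screen = [
        {"label": "Cancel", "x": 10, "y": 20},
        {"label": "Submit", "x": 30, "y": 40},
    ]
    print(ToyAgent(goal="Submit").step(screen))
```

Keeping each stage as a separate method mirrors how the collection groups its resources: a paper on GUI grounding, for example, would mainly concern the `perceive` stage in this sketch.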

Quick Start & Requirements

As a curated list, direct installation or execution is not applicable. Users must refer to individual linked papers, GitHub projects, and Hugging Face models for specific implementations, each with its own prerequisites (e.g., Python, libraries, GPU). Relevant links are provided throughout.

Highlighted Details

  • Comprehensive, regularly updated lists of research papers categorized by sub-fields like GUI Grounding, Reinforcement Learning, and Benchmarks.
  • Covers a broad spectrum of GUI agent applications: desktop, web, and mobile environments.
  • Features summaries and analyses of recent AAAI and ICLR accepted papers, highlighting cutting-edge research.
  • Includes links to open-sourced models (e.g., GUI-G2-3B, GUI-G2-7B) and project pages for specific agents.

Maintenance & Community

The repository is actively maintained with frequent updates, including new research papers and weekly summaries. Contributions are welcomed via pull requests. No specific community channels or dedicated maintainer information are provided.

Licensing & Compatibility

No software license is explicitly stated in the provided text. Until a license is confirmed, compatibility with commercial use or closed-source linking cannot be determined.

Limitations & Caveats

This is a curated list, not a runnable framework, so users must integrate individual components themselves. The lack of explicit licensing is a significant adoption blocker. Because the research moves quickly, listed information may need to be verified against primary sources.

Health Check

Last Commit: 2 weeks ago
Responsiveness: Inactive
Pull Requests (30d): 0
Issues (30d): 0
Star History: 17 stars in the last 30 days
