OS-Agent-Survey  by OS-Agent-Survey

Survey paper on OS Agents using MLLMs for computer, phone, and browser automation

created 7 months ago
313 stars

Top 87.4% on sourcepulse

Project Summary

This repository provides a comprehensive survey of OS Agents: Multimodal Large Language Model (MLLM)-based agents that automate tasks on computers, phones, and browsers by interacting with their interfaces. It is a resource for researchers and developers working on AI agents for operating system interaction.

How It Works

The survey categorizes and details existing research on OS Agents, covering foundation models, agent frameworks, evaluation benchmarks, and safety and privacy considerations. It consolidates the state of the art and offers insights into methodologies, challenges, and future directions for building and deploying these agents.

Quick Start & Requirements

This repository is a curated list of papers and resources, not a runnable software agent. No installation or specific requirements are needed to access the information.

Highlighted Details

  • Extensive tables categorize recent foundation models, agent frameworks, and evaluation benchmarks for OS Agents.
  • Includes a "Full List" section with chronological updates on new research papers in the field.
  • Details hiring opportunities with OPPO's Personal AI Team for roles in multimodal LLMs and AI Agents.
  • Provides links to related GitHub repositories and resources for further community engagement.

Maintenance & Community

The repository is actively updated, with the last update noted as December 13, 2024. Contact information is provided for suggestions and corrections.

Licensing & Compatibility

The repository itself does not specify a license. The content is presented for informational and research purposes.

Limitations & Caveats

The paper associated with this repository was notably rejected by arXiv for not containing "sufficient original or substantive scholarly research," a decision the authors contest. Access to the paper is currently limited to the GitHub repository or OpenReview Archive.

Health Check
  • Last commit: 1 month ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 54 stars in the last 90 days
