lahfir/agent-desktop
Desktop automation CLI for AI agents
Top 42.2% on SourcePulse
Summary
Agent Desktop is a native command-line interface (CLI), built in Rust, that lets AI agents automate desktop applications. It gives structured, deterministic control over applications by reading OS accessibility trees instead of relying on brittle methods such as pixel matching or screenshots. This lets AI agents interact with the desktop reliably, and its compact tree representation reduces token usage for dense applications.
How It Works
The core of agent-desktop is a fast, single-binary Rust CLI that interacts with applications via their native accessibility APIs. It outputs structured JSON, providing machine-readable responses with error codes and recovery hints. A key innovation is its use of deterministic element references (e.g., @e1) derived from accessibility tree snapshots, allowing AI agents to target UI elements reliably across interactions. For AI optimization, it employs progressive skeleton traversal, generating shallow overviews that can be drilled down into, drastically reducing token consumption. A C-ABI dynamic library (cdylib) facilitates in-process integration with languages like Python, Swift, and Go, avoiding the overhead of forking the CLI process per command.
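The two ideas above, deterministic element references and progressive skeleton traversal, can be illustrated with a small Python sketch. This is not agent-desktop's implementation: the `Node` class, the pre-order numbering scheme, and the `collapsed_children` field are all assumptions made for illustration, since the summary does not say how references are derived or what the JSON output looks like.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical stand-in for one accessibility-tree element."""
    role: str
    name: str = ""
    children: list["Node"] = field(default_factory=list)


def assign_refs(root: Node) -> dict[str, Node]:
    """Number nodes in depth-first pre-order (assumed scheme), so the
    same snapshot always produces the same @eN references."""
    refs: dict[str, Node] = {}

    def visit(node: Node) -> None:
        refs[f"@e{len(refs) + 1}"] = node
        for child in node.children:
            visit(child)

    visit(root)
    return refs


def skeleton(root: Node, refs: dict[str, Node], max_depth: int) -> dict:
    """Build a shallow overview: subtrees below max_depth are replaced
    by a child count instead of being expanded."""
    ref_of = {id(node): ref for ref, node in refs.items()}

    def build(node: Node, depth: int) -> dict:
        entry = {"ref": ref_of[id(node)], "role": node.role, "name": node.name}
        if depth < max_depth:
            entry["children"] = [build(c, depth + 1) for c in node.children]
        elif node.children:
            entry["collapsed_children"] = len(node.children)
        return entry

    return build(root, 0)


# A made-up snapshot of a small window.
window = Node("window", "Settings", [
    Node("toolbar", "", [Node("button", "Back"), Node("button", "Search")]),
    Node("list", "Sidebar", [Node("row", "General"), Node("row", "Privacy")]),
])

refs = assign_refs(window)
overview = skeleton(window, refs, max_depth=1)      # shallow first pass
detail = skeleton(refs["@e5"], refs, max_depth=1)   # drill into the sidebar
```

In this sketch the overview lists the toolbar and sidebar with only a count of their children, and the agent asks for more detail on one reference at a time. That is the general way a shallow-then-drill-down traversal keeps the amount of text sent to a model small.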
Quick Start & Requirements
Installation is recommended via npm: npm install -g agent-desktop, which automatically downloads a prebuilt binary. Alternatively, it can be built from source using cargo build --release (requires Rust 1.78+). macOS 13.0+ is required, and Accessibility permissions must be granted via System Settings or by running agent-desktop permissions --request. Prebuilt C-ABI cdylib artifacts are available per release for cross-platform FFI integration.
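A script that drives the CLI might check these requirements before doing anything else. The following Python sketch uses only the standard library; the `preflight` and `parse_version` helpers are hypothetical names, and the only facts taken from the text above are the macOS 13.0 minimum, the `agent-desktop` binary name, and the npm install command. It does not check Accessibility permissions, which still have to be granted separately.

```python
import platform
import shutil

MIN_MACOS = (13, 0)  # minimum version stated in the requirements


def parse_version(text: str) -> tuple[int, ...]:
    """Turn a dotted version such as '14.4.1' into a tuple of ints,
    padded to at least (major, minor)."""
    parts = tuple(int(p) for p in text.split(".") if p.isdigit())
    return parts + (0,) * (2 - len(parts))


def preflight() -> list[str]:
    """Return a list of human-readable problems; empty means ready."""
    problems = []
    mac_version = platform.mac_ver()[0]  # empty string on non-macOS systems
    if not mac_version:
        problems.append("not running on macOS")
    elif parse_version(mac_version) < MIN_MACOS:
        problems.append(f"macOS {mac_version} is older than 13.0")
    if shutil.which("agent-desktop") is None:
        problems.append(
            "agent-desktop not found on PATH; try: npm install -g agent-desktop"
        )
    return problems


if __name__ == "__main__":
    for problem in preflight():
        print("preflight:", problem)
```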
Highlighted Details
Deterministic element references (@e1, @e2) generated from accessibility snapshots for reliable AI-driven interactions.
A C-ABI cdylib allows direct, efficient calls from various programming languages without process forking.
Maintenance & Community
The repository is hosted on GitHub at https://github.com/lahfir/agent-desktop. The README does not describe community channels, active maintainers, or a project roadmap.
Licensing & Compatibility
Agent Desktop is licensed under the Apache-2.0 license. This permissive license allows for commercial use and integration into closed-source projects.
Limitations & Caveats
Full platform support, including accessibility tree interaction, is currently limited to macOS. Support for Windows and Linux is marked as "Planned," indicating these platforms may not be fully functional or may lack certain features in the current release.