agent-device by callstackincubator

CLI for AI agents to automate mobile, TV, and desktop app interactions

Created 3 months ago
2,145 stars

Top 20.4% on SourcePulse

Summary

agent-device is a CLI tool that lets AI agents control mobile, TV, and desktop apps. Agents interact with running apps, inspect the UI, and collect evidence, working from token-efficient accessibility snapshots rather than pixel screenshots. This supports automated QA, development, and testing, and closes the loop from code generation to verified execution and feedback.

How It Works

The tool captures accessibility snapshots as compact UI trees, and agents act on individual elements through references such as @e3. It supports actions such as touch and text input, and captures evidence (screenshots, video, logs, performance metrics) on demand. Platform backends such as XCTest (iOS/tvOS) and ADB (Android) sit behind a unified device automation interface.
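
To make the element-reference idea concrete, here is a minimal sketch of how a UI tree could be flattened into compact lines with refs such as @e3. The types and function names (UiNode, Snapshot, takeSnapshot, resolveRef) and the line layout are hypothetical illustrations, not agent-device's actual snapshot format or API.

```typescript
// Hypothetical data model: illustrative only, not agent-device's real format.
interface UiNode {
  role: string;
  label?: string;
  children?: UiNode[];
}

interface Snapshot {
  lines: string[];           // compact text an agent would read
  refs: Map<string, UiNode>; // ref such as "@e3" -> node it points at
}

// Walk the tree in pre-order, giving each node a sequential ref.
function takeSnapshot(root: UiNode): Snapshot {
  const lines: string[] = [];
  const refs = new Map<string, UiNode>();
  let counter = 0;

  const visit = (node: UiNode, depth: number): void => {
    counter += 1;
    const ref = `@e${counter}`;
    refs.set(ref, node);
    const label = node.label ? ` "${node.label}"` : "";
    lines.push(`${"  ".repeat(depth)}${ref} ${node.role}${label}`);
    for (const child of node.children ?? []) {
      visit(child, depth + 1);
    }
  };

  visit(root, 0);
  return { lines, refs };
}

// Look up the node behind a ref; unknown refs fail loudly.
function resolveRef(snapshot: Snapshot, ref: string): UiNode {
  const node = snapshot.refs.get(ref);
  if (!node) {
    throw new Error(`Unknown element ref: ${ref}`);
  }
  return node;
}

// Example screen (made-up data).
const screen: UiNode = {
  role: "window",
  label: "Login",
  children: [
    { role: "textfield", label: "Email" },
    { role: "button", label: "Sign in" },
  ],
};

const snapshot = takeSnapshot(screen);
console.log(snapshot.lines.join("\n"));
```

In this sketch the agent reads a few short text lines instead of an image, then names the target by ref, which is one plausible reason a snapshot costs fewer tokens than a screenshot.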

Quick Start & Requirements

Install with npm install -g agent-device@latest. Prerequisites: Node.js 22+, Xcode (for Apple targets), the Android SDK with ADB (for Android), and macOS Accessibility permissions (for desktop automation). See agent-device.dev for details.
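
The Node.js 22+ prerequisite can be checked before installing. The helper below is a small sketch; the function name meetsNodeRequirement is made up for illustration and is not part of agent-device.

```typescript
// Hypothetical helper: returns true when a Node.js version string
// (with or without a leading "v") has a major version >= minMajor.
function meetsNodeRequirement(version: string, minMajor: number = 22): boolean {
  const [majorPart = ""] = version.replace(/^v/, "").split(".");
  const major = Number.parseInt(majorPart, 10);
  return Number.isInteger(major) && major >= minMajor;
}

// Sample inputs; in a real script you would pass process.versions.node.
console.log(meetsNodeRequirement("22.3.0"));
console.log(meetsNodeRequirement("v20.11.1"));
```

Keeping the function pure (it takes the version as an argument) makes it easy to test without depending on the runtime it happens to execute in.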

Highlighted Details

  • Broad Platform Support: Automates iOS, Android, tvOS, Android TV, macOS, and Linux on real devices and simulators.
  • Agent-Native UI Model: Uses accessibility snapshots and element refs for efficient UI inspection and interaction, with selectors for durable replay.
  • Comprehensive Evidence Capture: Collects screenshots, video, logs, network traffic, performance data, crash logs, and React render profiles.
  • Replayable Workflows: Generates .ad replay scripts for CI/CD and local execution, alongside e2e tests and debugging artifacts.
  • React Native/Expo Focus: Inspects component trees, props, state, hooks, and profiles React Native apps.

Maintenance & Community

Developed by Callstack. Contributing guidelines are available via CONTRIBUTING.md. Project resources are at agent-device.dev.

Licensing & Compatibility

Released under the permissive MIT license, allowing free use in commercial and closed-source projects.

Limitations & Caveats

Known limitations are documented separately. Desktop automation requires granting macOS Accessibility permissions during setup. The tool is designed for agentic workflows, so its usability for people driving it by hand may be less polished.

Health Check

  • Last Commit: 16 hours ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 98
  • Issues (30d): 30

Star History

337 stars in the last 30 days
