agent-browser  by vercel-labs

Browser automation CLI for AI agents

Created 2 weeks ago

10,573 stars

Top 4.8% on SourcePulse

Summary

This project provides a command-line interface (CLI) for browser automation, designed for use by AI agents. It lets an agent control a web browser and interact with web pages through fast, deterministic commands whose output is easy for a model to parse. The main benefit is a simpler workflow for AI-driven web tasks.

How It Works

The tool uses a client-daemon architecture: a fast Rust CLI communicates with a Node.js daemon, which manages the browser instance via Playwright. The central feature is a snapshot command that generates an accessibility tree in which each element carries a unique, stable ref (e.g., @e1). Subsequent actions such as clicking or filling target these refs, so element selection is deterministic and the output is straightforward for an AI agent to parse and act on. Traditional CSS selectors and XPath are also supported for broader compatibility.
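
The snapshot-then-act loop described above can be sketched as a short shell session. This is a minimal sketch under stated assumptions, not an excerpt from the README: the open subcommand, the URL, and the refs @e2 and @e3 are placeholders chosen for illustration, and real refs come from whatever snapshot prints for the page. The ab function is a hypothetical helper that only prints each command unless AB_RUN=1 is set, so the sketch can be dry-run without a browser installed.

```shell
#!/bin/sh
set -eu

# Hypothetical helper: print the command by default,
# and run the real CLI only when AB_RUN=1 is set.
ab() {
  if [ "${AB_RUN:-0}" = "1" ]; then
    agent-browser "$@"
  else
    echo "agent-browser $*"
  fi
}

# 1. Load a page (the `open` subcommand and the URL are assumptions).
ab open https://example.com

# 2. Print the accessibility tree; each element carries a ref such as @e1.
ab snapshot

# 3. Act on elements by ref instead of by CSS selector.
#    @e3 and @e2 are placeholders; use the refs from the snapshot output.
ab fill @e3 "user@example.com"
ab click @e2
```

Because a ref names one element in one snapshot, an agent would typically take a fresh snapshot after any action that changes the page before choosing the next ref.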

Quick Start & Requirements

  • Installation: Install globally via npm: npm install -g agent-browser. Download the default Chromium browser with agent-browser install. On Linux, system dependencies can be installed with agent-browser install --with-deps.
  • Prerequisites: Node.js and npm/pnpm are required. Playwright handles browser binary management.
  • Links: The README provides no demo or documentation links beyond the CLI commands themselves.

Highlighted Details

  • AI-Optimized Workflow: Utilizes snapshot with refs for AI agents to reliably identify and interact with elements.
  • Semantic Locators: Offers robust element finding via ARIA roles, labels, text content, and other semantic attributes, reducing reliance on brittle CSS selectors.
  • Isolated Sessions: Supports running multiple, independent browser instances using the --session flag or AGENT_BROWSER_SESSION environment variable, each with its own cookies and history.
  • Performance: Features a native Rust CLI for speed, with a Node.js fallback.
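
The isolated-session options listed above can be sketched as follows. This is a hedged illustration: the --session flag and the AGENT_BROWSER_SESSION variable come from the summary, but the session names agent1 and agent2, the open subcommand, the URLs, and the placement of the flag before the subcommand are assumptions. The ab function is a hypothetical dry-run helper that prints commands unless AB_RUN=1 is set.

```shell
#!/bin/sh
set -eu

# Hypothetical helper: print the command by default,
# and run the real CLI only when AB_RUN=1 is set.
ab() {
  if [ "${AB_RUN:-0}" = "1" ]; then
    agent-browser "$@"
  else
    echo "agent-browser $*"
  fi
}

# Option 1: name the session per command with the --session flag
# (flag position before the subcommand is an assumption).
ab --session agent1 open https://example.com
ab --session agent1 snapshot

# Option 2: select the session once through the environment, so later
# commands in this shell use it without repeating the flag.
export AGENT_BROWSER_SESSION=agent2
ab open https://example.org
ab snapshot
```

Each named session would keep its own cookies and history, so two agents can work against different sites, or the same site under different logins, without interfering with each other.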

Maintenance & Community

The README gives no details on maintainers, community channels (e.g., Discord, Slack), sponsorships, or a roadmap.

Licensing & Compatibility

  • License: Apache-2.0.
  • Compatibility: The Apache-2.0 license is permissive and generally compatible with commercial use and linking within closed-source projects.

Limitations & Caveats

The tool is a CLI intended primarily for programmatic use by agents or scripts. It supports multiple browsers via Playwright, with Chromium as the default. Beyond its platform support matrix, the README does not describe specific limitations, known bugs, or unsupported platforms.

Health Check

  • Last Commit: 1 day ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 151
  • Issues (30d): 126
  • Star History: 10,658 stars in the last 16 days
