browser-harness by browser-use

LLM browser automation agent

Created 1 month ago

14,597 stars

Top 3.6% on SourcePulse

8 Experts Love This Project

Yaowei Zheng

Author of LLaMA-Factory

Joe Walnes

Head of Experimental Projects at Stripe

Addy Osmani

Head of Chrome Developer Experience at Google

Travis Fischer

Founder of Agentic

and 4 more!

Project Summary

browser-harness is a minimal, self-healing browser harness that lets LLMs complete web tasks autonomously by driving Chrome directly over the Chrome DevTools Protocol (CDP). It targets developers and researchers who want robust browser automation without the overhead of a traditional framework, and it lets the LLM adapt and extend its own capabilities by editing the harness code mid-task.

How It Works

The harness establishes a direct WebSocket connection to Chrome using CDP, eliminating intermediate layers for maximum simplicity and speed. Its core innovation is a "self-healing" mechanism: when an LLM encounters a missing function during task execution, it can directly edit the harness's Python code (e.g., helpers.py) to implement the required functionality. Additionally, the system automatically generates reusable "domain skills" based on successful task executions, capturing site-specific selectors and workflows for future use by the agent.
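
To make the "direct WebSocket connection" concrete, here is a minimal sketch of the JSON frames that CDP exchanges over that socket. The message shapes (an `id`/`method`/`params` command, an `id`-matched `result` or `error` reply, and `id`-less events) follow the public CDP wire format; the function names `build_command` and `classify` are hypothetical and are not taken from the harness's own code.

```python
import itertools
import json

# Monotonic counter: every CDP command on a connection needs a unique integer id.
_ids = itertools.count(1)


def build_command(method, params=None, session_id=None):
    """Serialize a CDP command as the JSON text frame sent over the WebSocket."""
    message = {"id": next(_ids), "method": method, "params": params or {}}
    if session_id is not None:
        # In CDP's flat session mode, commands for an attached target carry a sessionId.
        message["sessionId"] = session_id
    return json.dumps(message)


def classify(raw):
    """Classify an incoming frame as a command result, a command error, or an event.

    Replies echo the id of the command they answer; events carry no id.
    """
    message = json.loads(raw)
    if "id" in message:
        kind = "error" if "error" in message else "result"
        return kind, message["id"], message.get(kind)
    return "event", message["method"], message.get("params", {})


# Example: the frame for navigating a page (hypothetical usage, not harness code).
frame = build_command("Page.navigate", {"url": "https://example.com"})
```

Because each reply is matched to its command only by `id`, a single socket can carry many in-flight commands and interleaved events, which is one reason a thin layer over CDP can stay small.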

Quick Start & Requirements

Primary install / run command: Follow install.md for initial setup and browser connection, then use run.py to execute tasks with the helpers preloaded.
Non-default prerequisites: Python environment, Chrome browser instance accessible via Chrome DevTools Protocol (CDP).
Links: install.md, SKILL.md, cloud.browser-use.com/new-api-key (for remote browser keys), docs.browser-use.com/llms.txt (for setup flow).
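
As an illustration of what "accessible via CDP" can mean in practice, the sketch below looks up the browser-level WebSocket URL that Chrome publishes when it is started with `--remote-debugging-port`. The port 9222 and both function names are assumptions made for this example; the harness's actual connection flow is whatever install.md describes and may differ.

```python
import json
from urllib.request import urlopen


def debugger_url_from_version(payload):
    """Extract the browser WebSocket URL from a /json/version response body."""
    info = json.loads(payload)
    return info["webSocketDebuggerUrl"]


def fetch_debugger_url(host="127.0.0.1", port=9222, timeout=2.0):
    """Ask a locally running Chrome for its CDP WebSocket endpoint.

    Assumes Chrome was launched with --remote-debugging-port=<port>;
    raises an OSError subclass if nothing is listening there.
    """
    with urlopen(f"http://{host}:{port}/json/version", timeout=timeout) as response:
        return debugger_url_from_version(response.read().decode("utf-8"))
```

Splitting the parsing from the network call keeps the parsing step checkable without a running browser.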

Highlighted Details

Self-Healing & Agent-Driven Code Generation: LLMs can directly edit harness code (helpers.py) to implement missing functions mid-task.
Automated Skill Discovery: The agent automatically generates reusable "domain skills" based on successful task executions, capturing site-specific interactions.
Minimalist Architecture: Built directly on CDP with a single WebSocket connection, avoiding complex frameworks or abstractions.
Free Remote Browsers: Offers a free tier for remote browser instances suitable for sub-agents or deployment.
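
The self-healing idea in the first item can be sketched as a "look up, patch, reload, retry" loop. This is a simplified illustration under stated assumptions, not the harness's implementation: helper code is held as a string rather than a real helpers.py file, and `write_missing` is a hypothetical callback standing in for the LLM that writes the missing function.

```python
import types


def load_helpers(source):
    """Execute helper source text into a fresh module object."""
    module = types.ModuleType("helpers_sketch")
    exec(source, module.__dict__)
    return module


def call_with_healing(source, name, write_missing, *args):
    """Call the helper called `name`, adding it first if it does not exist.

    If the helper is missing, write_missing(name) must return Python source
    defining it; that source is appended, the helpers are reloaded, and the
    call is made. Returns (result, updated_source).
    """
    helpers = load_helpers(source)
    if not hasattr(helpers, name):
        source = source.rstrip("\n") + "\n\n" + write_missing(name) + "\n"
        helpers = load_helpers(source)
    # Raises AttributeError if the generated code still does not define `name`.
    return getattr(helpers, name)(*args), source


# Hypothetical usage: a stub "LLM" that supplies one missing helper.
base_source = "def normalize_url(url):\n    return url.strip()\n"


def fake_llm(name):
    return "def add_scheme(url):\n    return url if '://' in url else 'https://' + url"


result, updated_source = call_with_healing(base_source, "add_scheme", fake_llm, "example.com")
# result is "https://example.com"; updated_source now contains add_scheme
```

The sketch retries only once and does not validate the generated code, which mirrors the caveat that a faulty edit by the LLM can itself introduce errors.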

Maintenance & Community

Contributions are encouraged, particularly new domain skills under domain-skills/. Primary interaction points appear to be Pull Requests and GitHub Issues. No specific community channels (Discord/Slack) or roadmap links are provided in the text.

Licensing & Compatibility

License information is not explicitly stated in the provided README text.

Limitations & Caveats

The "self-healing" mechanism relies on the LLM editing Python code correctly; a faulty edit can introduce errors, and reliable results may require careful prompt engineering. The effectiveness of automated skill generation depends on the LLM's performance and the complexity of the target websites. The provided text does not state which operating systems or browser versions are supported, beyond the requirement for Chrome.

Health Check

Last Commit

3 weeks ago

Responsiveness

Inactive

Star History

2,185 stars in the last 30 days