openbrowser by ntegrals

Autonomous web browsing toolkit for AI agents

Created 7 years ago

9,490 stars

Top 5.5% on SourcePulse

Project Summary

Open Browser is an autonomous web browsing framework that lets AI agents navigate, interact with, and extract data from websites using natural language instructions. It targets developers and researchers building AI-driven automation, and offers multi-model support, an interactive REPL for debugging, and production-oriented features for developing and deploying browser-based AI solutions.

How It Works

The core architecture centers on an Agent that mediates between a chosen language model (OpenAI, Anthropic, or Google) and a Playwright-powered browser Viewport. The Agent receives a task, sends the current page state to the LLM to generate commands (e.g., click, type, extract), executes those commands through the Viewport, and observes the results, repeating the cycle until the task is complete. This loop lets the LLM's reasoning drive navigation and data extraction on dynamic pages.
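The observe, decide, execute cycle described above can be sketched in a few lines of TypeScript. This is a minimal illustration under stated assumptions, not the project's implementation: every name below (Command, PageState, Model, Viewport, runAgent, scriptedModel, fakeViewport) is hypothetical, and the model and browser are replaced by in-memory stand-ins so the loop can run offline.

```typescript
// Hypothetical sketch: none of these names come from the openbrowser codebase.
type Command =
  | { kind: "click"; selector: string }
  | { kind: "type"; selector: string; text: string }
  | { kind: "extract"; selector: string }
  | { kind: "done"; result: string };

interface PageState {
  url: string;
  text: string;
}

interface Model {
  // Given the task, the current page state, and past commands, propose the next command.
  nextCommand(task: string, state: PageState, history: Command[]): Promise<Command>;
}

interface Viewport {
  observe(): Promise<PageState>;
  execute(command: Command): Promise<void>;
}

// Observe the page, ask the model for a command, execute it, and repeat
// until the model signals completion or the step budget runs out.
async function runAgent(
  task: string,
  model: Model,
  viewport: Viewport,
  maxSteps = 10,
): Promise<{ result: string | null; steps: Command[] }> {
  const steps: Command[] = [];
  for (let i = 0; i < maxSteps; i++) {
    const state = await viewport.observe();
    const command = await model.nextCommand(task, state, steps);
    steps.push(command);
    if (command.kind === "done") {
      return { result: command.result, steps };
    }
    await viewport.execute(command);
  }
  return { result: null, steps }; // step budget exhausted without completion
}

// Stand-in for the LLM: replays a fixed script, then reports completion.
function scriptedModel(script: Command[]): Model {
  let i = 0;
  return {
    async nextCommand() {
      return script[i++] ?? { kind: "done", result: "" };
    },
  };
}

// Stand-in for the browser: records executed command kinds instead of driving a page.
function fakeViewport(): Viewport & { log: string[] } {
  const log: string[] = [];
  return {
    log,
    async observe() {
      return { url: "https://example.com", text: `actions so far: ${log.length}` };
    },
    async execute(command) {
      log.push(command.kind);
    },
  };
}
```

In a real setup the Model would wrap an LLM call and the Viewport would wrap a Playwright page; the step budget is one simple way to keep a confused agent from looping indefinitely.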

Quick Start & Requirements

Installation requires the Bun runtime: run bun install to fetch dependencies. Configure LLM API keys by copying .env.example to .env and filling in the values. Run an agent with bun run open-browser run "Your task description", or start an interactive session with bun run open-browser interactive. The key dependencies are the Bun JavaScript runtime and API access to a supported LLM provider.
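Before launching an agent it can help to confirm which provider keys are actually set. The sketch below is an assumption-laden illustration: the variable names follow the defaults read by the Vercel AI SDK provider packages, but the names in the project's .env.example may differ, and configuredProviders is a hypothetical helper rather than part of openbrowser.

```typescript
// Assumed variable names (Vercel AI SDK provider defaults); check .env.example
// for the names the project actually expects.
const PROVIDER_KEYS = {
  openai: "OPENAI_API_KEY",
  anthropic: "ANTHROPIC_API_KEY",
  google: "GOOGLE_GENERATIVE_AI_API_KEY",
} as const;

type Provider = keyof typeof PROVIDER_KEYS;

// Hypothetical helper: list the providers whose key is present and non-blank.
function configuredProviders(env: Record<string, string | undefined>): Provider[] {
  return (Object.keys(PROVIDER_KEYS) as Provider[]).filter((provider) => {
    const value = env[PROVIDER_KEYS[provider]];
    return value !== undefined && value.trim() !== "";
  });
}
```

Under Bun, which loads .env automatically, calling configuredProviders(process.env) at startup gives a quick check that at least one provider is usable before a task is started.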

Highlighted Details

Autonomous Agents: AI agents complete tasks described in natural language by interacting with websites.
Multi-Model Support: Works with OpenAI, Anthropic, and Google models via the Vercel AI SDK.
Interactive REPL: A live browser session for debugging, prototyping, and direct command execution.
Sandboxed Execution: Agents can run with resource limits (CPU, memory), timeouts, and domain restrictions.
Production-Ready: Features include stall detection, cost tracking, session management, and replay recording.
Extensive Command Set: Over 25 built-in commands for browser control and data manipulation.
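Two of the safeguards listed above, domain restrictions and stall detection, are easy to illustrate. The following is a hedged sketch of how such checks could work in principle; isAllowedUrl and createStallDetector are hypothetical names, and the project's own implementation may use different rules.

```typescript
// Hypothetical sketch: these helpers are not taken from the openbrowser codebase.

// Domain restriction: allow a URL only if its host equals an allowed domain
// or is a subdomain of one. Unparseable URLs are rejected.
function isAllowedUrl(url: string, allowedDomains: string[]): boolean {
  let host: string;
  try {
    host = new URL(url).hostname.toLowerCase();
  } catch {
    return false;
  }
  return allowedDomains.some((entry) => {
    const domain = entry.toLowerCase();
    return host === domain || host.endsWith("." + domain);
  });
}

// Stall detection: report a stall once the same page fingerprint
// (for example a hash of URL plus visible text) is seen `limit` times in a row.
function createStallDetector(limit: number): (fingerprint: string) => boolean {
  let last: string | null = null;
  let repeats = 0;
  return (fingerprint) => {
    if (fingerprint === last) {
      repeats += 1;
    } else {
      last = fingerprint;
      repeats = 1;
    }
    return repeats >= limit;
  };
}
```

Matching on the parsed hostname rather than on the raw URL string avoids accepting look-alike hosts such as evil-example.com when only example.com is allowed.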

Maintenance & Community

Contributions are welcomed, with a dedicated CONTRIBUTING.md file available. The project is open source and actively maintained. No specific community channels (e.g., Discord, Slack) or major sponsorships are detailed in the README.

Licensing & Compatibility

The project is released under the MIT license. This permissive license allows commercial use and integration within closed-source applications, provided the copyright and license notice are retained.

Limitations & Caveats

The project requires the Bun runtime for installation and execution. The effectiveness of autonomous agents depends on the capabilities of the selected LLM. Although the project is described as production-ready, highly complex or rapidly changing web interfaces may still resist full automation.

Explore Similar Projects

dendrite-python-sdk by dendrite-systems

browser by CognosysAI

oxylabs-ai-studio-py by oxylabs

fuji-web by normal-computing

TheAgenticBrowser by TheAgenticAI

molmoweb by allenai

mcp-browser-use by Saik0s

awesome-web-agents by steel-dev

codel by semanser

ANUS by anus-dev

nanobrowser by nanobrowser

stagehand by browserbase