browserbee  by parsaghaffari

AI browser assistant for natural language web control

Created 4 months ago
910 stars

Top 40.0% on SourcePulse

Project Summary

BrowserBee is an open-source Chrome extension that acts as a privacy-first, AI-powered browser assistant, letting users control web browsing with natural language commands. It targets users who want a personal AI assistant for tasks such as social media management, news curation, research, and knowledge summarization, and it aims for convenience and security by running primarily within the browser.

How It Works

BrowserBee combines a Large Language Model (LLM), which interprets user instructions and plans actions, with Playwright, which carries out the browser automation. This lets it interact with web pages, execute multi-step sequences of actions, and maintain privacy by processing data locally. Running Playwright inside a browser extension is highlighted as a novel way to simplify browser automation for end users, compared with the traditional architecture in which a backend service drives a separate browser.
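The plan-then-act pattern described above can be illustrated with a small loop. This is a minimal sketch, not BrowserBee's actual code: the `Planner`, `Tool`, and `runAgent` names are hypothetical, the planner is a plain function standing in for an LLM call, and the single `navigate` tool is a stub standing in for a Playwright-backed action.

```typescript
// Hypothetical types: none of these names come from the BrowserBee codebase.
type ToolCall = { tool: string; input: string };
type PlannerStep =
  | { done: true; answer: string }
  | { done: false; call: ToolCall };

// A planner maps the task plus the observations so far to the next step.
// In a real assistant this would be an LLM call; here it is a plain function.
type Planner = (task: string, observations: string[]) => PlannerStep;

// A tool performs one browser action and returns a text observation.
type Tool = (input: string) => string;

function runAgent(
  task: string,
  planner: Planner,
  tools: Record<string, Tool>,
  maxSteps = 10,
): string {
  const observations: string[] = [];
  for (let step = 0; step < maxSteps; step++) {
    const next = planner(task, observations);
    if (next.done) return next.answer;
    const tool = tools[next.call.tool];
    // Unknown tools are reported back to the planner instead of throwing.
    observations.push(
      tool ? tool(next.call.input) : `error: unknown tool "${next.call.tool}"`,
    );
  }
  return "stopped: step limit reached";
}

// Stub planner and tool standing in for the LLM and the automation layer.
const stubPlanner: Planner = (_task, observations) =>
  observations.length === 0
    ? { done: false, call: { tool: "navigate", input: "https://example.com" } }
    : { done: true, answer: `finished after: ${observations[0]}` };

const stubTools: Record<string, Tool> = {
  navigate: (url) => `navigated to ${url}`,
};

const result = runAgent("open example.com", stubPlanner, stubTools);
console.log(result);
```

The step limit is a design choice of this sketch: it keeps a planner that never finishes from looping indefinitely and spending tokens.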

Quick Start & Requirements

  • Installation: There are three options. (1) Download the latest release, unzip it, and load it as an unpacked extension in Chrome (chrome://extensions/ -> Developer mode -> Load unpacked). (2) Build from source (npm install then npm run build, or pnpm install then pnpm build) and load the dist directory. (3) Install from the Chrome Web Store.
  • Prerequisites: LLM API keys for supported providers (Anthropic, OpenAI, Gemini) or Ollama configuration.
  • Usage: Open the side panel (toolbar icon or Alt+Shift+B), enter a natural language command, and press Enter.
  • Notes: Requires an open base tab for Chrome DevTools Protocol (CDP) attachment; it cannot attach to chrome:// or chrome-extension:// URLs.
  • Documentation: ROADMAP.md
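The attachment restriction in the notes above can be expressed as a simple URL check. This is a sketch under stated assumptions: `isAttachableUrl` and `pickBaseTab` are hypothetical helpers, not functions from the extension, and the check covers only the two URL schemes named above.

```typescript
// Schemes named in the notes as impossible to attach to. Only these two
// are listed; other restricted pages are not covered by this sketch.
const BLOCKED_PREFIXES = ["chrome://", "chrome-extension://"];

// Hypothetical helper: true when a tab URL is a plausible base tab.
function isAttachableUrl(url: string): boolean {
  const normalized = url.trim().toLowerCase();
  if (normalized === "") return false; // no open page to attach to
  return !BLOCKED_PREFIXES.some((prefix) => normalized.startsWith(prefix));
}

// Hypothetical helper: first attachable URL among the open tabs, if any.
function pickBaseTab(urls: string[]): string | undefined {
  return urls.find(isAttachableUrl);
}

console.log(pickBaseTab(["chrome://extensions/", "https://example.com/"]));
```

In practice this means that if the only open tab is the extensions page, a regular web page has to be opened before issuing a command.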

Highlighted Details

  • Supports major LLM providers (OpenAI, Anthropic, Gemini, Ollama) and includes token usage tracking.
  • Features a comprehensive set of browser interaction tools, including navigation, tab management, element interaction, DOM querying, and screenshotting.
  • Includes a "memory" feature to save and reuse efficient tool sequences, potentially reducing token costs.
  • Agents can request user approval for sensitive actions like purchases or social media posts.
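To make the "memory" idea above concrete, here is a minimal sketch of a store that saves a tool sequence that worked and returns it for the same task later. The storage shape, the `MemoryStore` class, and the exact-match lookup by domain and task are all assumptions for illustration; the source does not describe how BrowserBee stores or retrieves memories.

```typescript
// Hypothetical shapes; the source does not say how memories are stored.
type ToolStep = { tool: string; input: string };
type Memory = { domain: string; task: string; steps: ToolStep[] };

class MemoryStore {
  private memories: Memory[] = [];

  // Save a sequence that worked. An existing entry for the same domain and
  // task is replaced only if the new sequence is not longer than the old one.
  save(memory: Memory): void {
    const index = this.memories.findIndex(
      (m) => m.domain === memory.domain && m.task === memory.task,
    );
    if (index === -1) {
      this.memories.push(memory);
    } else if (memory.steps.length <= this.memories[index].steps.length) {
      this.memories[index] = memory;
    }
  }

  // Exact-match lookup; returns undefined when nothing has been saved.
  recall(domain: string, task: string): ToolStep[] | undefined {
    return this.memories.find(
      (m) => m.domain === domain && m.task === task,
    )?.steps;
  }
}

const store = new MemoryStore();
store.save({
  domain: "example.com",
  task: "open the pricing page",
  steps: [{ tool: "navigate", input: "https://example.com/pricing" }],
});
console.log(store.recall("example.com", "open the pricing page"));
```

A recalled sequence can be handed to the model as a hint or replayed directly, which is where the potential token savings would come from: fewer exploratory steps are needed to rediscover a known path.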

Maintenance & Community

  • The project is actively developed by parsaghaffari.
  • Contribution guidelines are available in CONTRIBUTING.md.

Licensing & Compatibility

  • License: Apache 2.0.
  • Compatibility: Permissive license suitable for commercial use and integration with closed-source projects.

Limitations & Caveats

Interacting with web pages remains challenging for LLM agents because DOMs and screenshots have low information density. Good performance requires simplified page representations and efficient models.

Health Check

  • Last commit: 1 month ago
  • Responsiveness: 1 day
  • Pull requests (30d): 0
  • Issues (30d): 0

Star History

  • 78 stars in the last 30 days
