browserbee by parsaghaffari

AI browser assistant for natural language web control

Created 6 months ago
935 stars

Top 39.1% on SourcePulse

Project Summary

BrowserBee is an open-source Chrome extension that acts as a privacy-first AI-powered browser assistant, enabling users to control web browsing with natural language commands. It targets users who want a personal AI assistant for tasks like social media management, news curation, research, and knowledge summarization, offering convenience and security by running primarily within the browser.

How It Works

BrowserBee leverages a combination of a Large Language Model (LLM) for understanding and planning user instructions and Playwright for robust browser automation. This approach allows it to interact with web pages, execute complex sequences of actions, and maintain privacy by processing data locally. The integration of Playwright within a browser extension is highlighted as a novel way to simplify browser automation for end-users compared to traditional backend service-browser architectures.
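The plan-then-act cycle described above can be sketched roughly as follows. This is an illustration only, not BrowserBee's code: every name (`ToolCall`, `Planner`, `Tools`, `runAgent`) is hypothetical, the planner is a scripted stand-in for an LLM, and the tools are plain functions where the real extension would call Playwright. It is synchronous for brevity; real model calls and browser actions would be asynchronous.

```typescript
// Hypothetical types and names, for illustration only.
type ToolCall = { tool: string; input: string };
type Step = ToolCall | { done: true; answer: string };

// Stand-in for the LLM: given the transcript so far, choose the next step.
type Planner = (transcript: string[]) => Step;

// Stand-in for the browser tools (the real ones would wrap Playwright).
type Tools = Record<string, (input: string) => string>;

function runAgent(task: string, plan: Planner, tools: Tools, maxSteps = 10): string {
  const transcript: string[] = [`task: ${task}`];
  for (let i = 0; i < maxSteps; i++) {
    const step = plan(transcript);
    if ("done" in step) return step.answer;
    // Run the requested tool and feed the observation back to the planner.
    const tool = tools[step.tool];
    const observation = tool ? tool(step.input) : `unknown tool: ${step.tool}`;
    transcript.push(`${step.tool}(${step.input}) -> ${observation}`);
  }
  return "stopped: step limit reached";
}

// Scripted planner standing in for a real model.
const scriptedPlanner: Planner = (transcript) => {
  if (transcript.length === 1) return { tool: "navigate", input: "https://example.com" };
  if (transcript.length === 2) return { tool: "readTitle", input: "" };
  return { done: true, answer: transcript[transcript.length - 1] };
};

const fakeTools: Tools = {
  navigate: (url) => `loaded ${url}`,
  readTitle: () => "Example Domain",
};

const result = runAgent("What is the page title?", scriptedPlanner, fakeTools);
```

The step limit is a design choice of this sketch: it keeps a confused planner from looping forever and spending tokens on every iteration.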

Quick Start & Requirements

  • Installation: Install from the Chrome Web Store, or download the latest release, unzip it, and load it as an unpacked extension in Chrome (chrome://extensions/ -> Developer mode -> Load unpacked). Alternatively, build from source (npm install or pnpm install, then npm run build or pnpm build) and load the dist directory.
  • Prerequisites: LLM API keys for supported providers (Anthropic, OpenAI, Gemini) or Ollama configuration.
  • Usage: Open the side panel (toolbar icon or Alt+Shift+B), enter a natural language command, and press Enter.
  • Notes: Requires an open base tab to attach to via the Chrome DevTools Protocol (CDP); cannot attach to chrome:// or chrome-extension:// URLs.
  • Documentation: ROADMAP.md
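The attachment restriction in the notes above could be checked up front with a guard like the one below. This is a minimal sketch under stated assumptions: `canAttach` and `UNATTACHABLE` are hypothetical names, not part of BrowserBee, and the check covers only the two URL schemes the notes mention.

```typescript
// Hypothetical helper, not BrowserBee's API.
// Matches the two schemes the notes say cannot be attached to.
const UNATTACHABLE = /^(chrome|chrome-extension):\/\//i;

function canAttach(url: string): boolean {
  // An empty URL is treated as "no usable base tab".
  return url.length > 0 && !UNATTACHABLE.test(url);
}

const checks = [
  canAttach("https://example.com"),
  canAttach("chrome://extensions/"),
  canAttach("chrome-extension://abc/index.html"),
];
```

Running a check like this before attaching would let the assistant ask the user to open an ordinary web page instead of failing partway through a task.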

Highlighted Details

  • Supports major LLM providers (OpenAI, Anthropic, Gemini, Ollama) and includes token usage tracking.
  • Features a comprehensive set of browser interaction tools, including navigation, tab management, element interaction, DOM querying, and screenshotting.
  • Includes a "memory" feature to save and reuse efficient tool sequences, potentially reducing token costs.
  • Agents can request user approval for sensitive actions like purchases or social media posts.
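One way to picture the "memory" feature listed above is a small store of recorded tool sequences that can be replayed instead of being planned again by the model. The sketch below is an assumption-heavy illustration: `RecordedStep`, `Memory`, and `MemoryStore` are hypothetical names, keying by domain and task is a guess, and treating "efficient" as "fewest steps" is a simplification rather than BrowserBee's documented behavior.

```typescript
// Hypothetical shapes; BrowserBee's real memory format may differ.
type RecordedStep = { tool: string; input: string };
type Memory = { task: string; steps: RecordedStep[] };

class MemoryStore {
  private byDomain = new Map<string, Memory[]>();

  // Save a sequence, keeping only the shortest known one per task (an assumption).
  save(domain: string, memory: Memory): void {
    const existing = this.byDomain.get(domain) ?? [];
    const previous = existing.find((m) => m.task === memory.task);
    const others = existing.filter((m) => m.task !== memory.task);
    const best =
      previous && previous.steps.length <= memory.steps.length ? previous : memory;
    this.byDomain.set(domain, [...others, best]);
  }

  // Return the stored steps for a task on a domain, if any.
  recall(domain: string, task: string): RecordedStep[] | undefined {
    return this.byDomain.get(domain)?.find((m) => m.task === task)?.steps;
  }
}

const store = new MemoryStore();
const open: RecordedStep = { tool: "navigate", input: "https://example.com" };
const type: RecordedStep = { tool: "type", input: "bees" };
const submit: RecordedStep = { tool: "press", input: "Enter" };

store.save("example.com", { task: "search", steps: [open, type, type, submit] });
store.save("example.com", { task: "search", steps: [open, type, submit] });
const recalled = store.recall("example.com", "search");
```

Replaying a recalled sequence avoids asking the model to rediscover the same steps, which is where the potential token savings would come from.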

Maintenance & Community

  • The project is actively developed by parsaghaffari.
  • Contribution guidelines are available in CONTRIBUTING.md.

Licensing & Compatibility

  • License: Apache 2.0.
  • Compatibility: Permissive license suitable for commercial use and integration with closed-source projects.

Limitations & Caveats

Interacting with web pages remains a challenging task for LLM agents due to the low information density of DOMs and screenshots, requiring simplified representations and efficient models for optimal performance.

Health Check

  • Last commit: 1 week ago
  • Responsiveness: 1 day
  • Pull requests (30d): 0
  • Issues (30d): 2
  • Star history: 19 stars in the last 30 days
