browserbee  by parsaghaffari

AI browser assistant for natural language web control

created 3 months ago
814 stars

Top 44.4% on sourcepulse

Project Summary

BrowserBee is an open-source Chrome extension that acts as a privacy-first AI-powered browser assistant, enabling users to control web browsing with natural language commands. It targets users who want a personal AI assistant for tasks like social media management, news curation, research, and knowledge summarization, offering convenience and security by running primarily within the browser.

How It Works

BrowserBee combines a Large Language Model (LLM), which interprets user instructions and plans the steps, with Playwright, which carries out the browser automation. This lets it interact with web pages, execute multi-step action sequences, and maintain privacy by processing data locally. Running Playwright inside a browser extension is highlighted as a novel way to simplify browser automation for end users, compared with traditional architectures in which a backend service drives a separate browser.
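
The split between planning and acting can be sketched as a small loop. This is a minimal illustration and not BrowserBee's actual code: the `ToolCall`, `Tool`, and `Planner` types, the `runAgent` function, and the tool names are all hypothetical. The planner is a scripted stub standing in for an LLM call, and the tools return strings rather than driving Playwright.

```typescript
// Hypothetical types: none of these names come from BrowserBee itself.
type ToolCall = { tool: string; arg: string };
type Tool = (arg: string) => string;
// A planner returns the next tool call, or null when the task is done.
type Planner = (task: string, history: string[]) => ToolCall | null;

function runAgent(
  task: string,
  plan: Planner,
  tools: Map<string, Tool>,
  maxSteps: number = 10,
): string[] {
  const history: string[] = [];
  for (let step = 0; step < maxSteps; step++) {
    const call = plan(task, history);
    if (call === null) break; // planner signals completion
    const tool = tools.get(call.tool);
    // Unknown tools are reported back as observations instead of throwing.
    const observation =
      tool !== undefined ? tool(call.arg) : `error: unknown tool ${call.tool}`;
    history.push(`${call.tool}(${call.arg}) -> ${observation}`);
  }
  return history;
}

// Stub planner standing in for an LLM: navigate, then click, then stop.
const scriptedPlan: Planner = (_task, history) => {
  const script: ToolCall[] = [
    { tool: "navigate", arg: "https://example.com" },
    { tool: "click", arg: "a#more" },
  ];
  return script[history.length] ?? null;
};

// Stub tools standing in for Playwright-backed browser actions.
const stubTools = new Map<string, Tool>([
  ["navigate", (url: string) => `loaded ${url}`],
  ["click", (selector: string) => `clicked ${selector}`],
]);

const agentTrace = runAgent("open the example page", scriptedPlan, stubTools);
```

Passing the history back to the planner on every step is what would let a real model react to the result of the previous action; the `maxSteps` cap is an assumed safeguard against a planner that never signals completion.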

Quick Start & Requirements

  • Installation: Download latest release, unzip, and load unpacked extension in Chrome (chrome://extensions/ -> Developer mode -> Load unpacked). Alternatively, build from source (npm install or pnpm install, then npm run build or pnpm build) and load the dist directory, or install from the Chrome Web Store.
  • Prerequisites: LLM API keys for supported providers (Anthropic, OpenAI, Gemini) or Ollama configuration.
  • Usage: Open the side panel (toolbar icon or Alt+Shift+B), enter a natural language command, and press Enter.
  • Notes: Requires an open base tab for CDP attachment; cannot attach to chrome:// or chrome-extension:// URLs.
  • Documentation: ROADMAP.md
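
The attachment restriction in the notes above can be checked before a command is issued. The sketch below relies only on what the note states, namely that `chrome://` and `chrome-extension://` pages cannot be attached to. The helper name `canAttach` is hypothetical, and the real extension may reject other URLs that are not listed here.

```typescript
// Schemes that the note above says cannot be attached to.
const BLOCKED_PREFIXES: string[] = ["chrome://", "chrome-extension://"];

// Hypothetical helper: true when a tab URL is a plausible attachment target.
function canAttach(url: string): boolean {
  const normalized = url.trim().toLowerCase();
  if (normalized.length === 0) return false; // no open base tab to attach to
  return !BLOCKED_PREFIXES.some((prefix) => normalized.startsWith(prefix));
}

const okForWebPage = canAttach("https://example.com/news");
const okForSettings = canAttach("chrome://extensions/");
```

Here `okForWebPage` is `true` and `okForSettings` is `false`, which matches the behavior described in the note.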

Highlighted Details

  • Supports major LLM providers (OpenAI, Anthropic, Gemini, Ollama) and includes token usage tracking.
  • Features a comprehensive set of browser interaction tools, including navigation, tab management, element interaction, DOM querying, and screenshotting.
  • Includes a "memory" feature to save and reuse efficient tool sequences, potentially reducing token costs.
  • Agents can request user approval for sensitive actions like purchases or social media posts.
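
The approval step for sensitive actions can be pictured as a gate in front of tool execution. This is a rough sketch under stated assumptions: the `Action` shape, the `sensitive` flag, and the `askUser` callback are hypothetical and not taken from BrowserBee's code. A real prompt in an extension would be asynchronous; it is kept synchronous here for brevity.

```typescript
// Hypothetical action shape; `sensitive` marks purchases, posts, and similar.
type Action = { name: string; sensitive: boolean; run: () => string };
// Hypothetical callback that asks the user and returns their decision.
type AskUser = (actionName: string) => boolean;

function executeWithApproval(action: Action, askUser: AskUser): string {
  // Only sensitive actions trigger a prompt; everything else runs directly.
  if (action.sensitive && !askUser(action.name)) {
    return `skipped ${action.name}: user declined`;
  }
  return action.run();
}

// Example actions and a user who declines every prompt.
const declineAll: AskUser = () => false;
const publishPost: Action = { name: "publish_post", sensitive: true, run: () => "posted" };
const scrollPage: Action = { name: "scroll", sensitive: false, run: () => "scrolled" };

const declined = executeWithApproval(publishPost, declineAll);
const unaffected = executeWithApproval(scrollPage, declineAll);
```

In this example the post is skipped because the user declines, while the scroll runs without any prompt.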

Maintenance & Community

  • The project is actively developed by parsaghaffari.
  • Contribution guidelines are available in CONTRIBUTING.md.

Licensing & Compatibility

  • License: Apache 2.0.
  • Compatibility: Permissive license suitable for commercial use and integration with closed-source projects.

Limitations & Caveats

Interacting with web pages remains challenging for LLM agents: DOMs and screenshots have low information density, so good performance depends on simplified page representations and efficient models.
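
One way to picture a simplified representation is to keep only the elements an agent can act on and emit one short line per element. The sketch below works on a plain object tree rather than a live DOM. The `PageNode` type, the `simplify` function, and the list of interactive tags are assumptions made for illustration, not BrowserBee's actual representation.

```typescript
// Hypothetical, minimal stand-in for a DOM node.
type PageNode = { tag: string; text?: string; children?: PageNode[] };

// Assumed set of tags an agent can act on.
const INTERACTIVE_TAGS: string[] = ["a", "button", "input", "select", "textarea"];

// Depth-first walk that records one indexed line per interactive element.
function simplify(node: PageNode, out: string[] = []): string[] {
  if (INTERACTIVE_TAGS.indexOf(node.tag) !== -1) {
    const label = (node.text ?? "").trim().slice(0, 40); // cap label length
    const prefix = `[${out.length}] <${node.tag}>`;
    out.push(label.length > 0 ? `${prefix} ${label}` : prefix);
  }
  for (const child of node.children ?? []) simplify(child, out);
  return out;
}

const page: PageNode = {
  tag: "body",
  children: [
    { tag: "div", text: "Long promotional banner text that the agent does not need" },
    { tag: "a", text: "Sign in" },
    { tag: "form", children: [{ tag: "input" }, { tag: "button", text: "Search" }] },
  ],
};

const summary = simplify(page);
```

For this tree, `summary` holds three lines (the link, the input, and the button), and the banner text is dropped entirely. The numeric index gives a model a compact handle for referring to an element in a follow-up action.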

Health Check

  • Last commit: 2 months ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 2

Star History

  • 797 stars in the last 90 days
