superpowers-chrome by obra

Direct Chrome browser control for AI agents

Created 6 months ago
263 stars

Top 96.7% on SourcePulse

Project Summary

This project gives AI agents and MCP clients direct control of a Chrome browser through the Chrome DevTools Protocol (CDP), with zero dependencies. It targets developers working with AI coding agents (such as Claude Code) and users of MCP clients, and aims to make browser automation simpler and more robust through a small API that works across platforms.

How It Works

The project uses the Chrome DevTools Protocol (CDP) to interact with a running Chrome instance. It offers two modes: Skill Mode, a command-line interface (chrome-ws) for direct control and scripting, and MCP Mode, a lightweight server intended for integration with MCP clients. The built-in WebSocket server removes the need for an npm install step, and tabs are addressed by index (e.g., 0, 1, 2) rather than by full WebSocket URLs.
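To make the tab index idea concrete, here is a minimal Python sketch of how an index such as 0 or 1 could be mapped to a WebSocket URL. It relies on the target list that Chrome serves at /json on its remote debugging port. The port 9222, the helper names list_targets and resolve_tab, and the rule that only "page" targets count as tabs are assumptions for illustration, not the project's actual implementation.

```python
import json
from urllib.request import urlopen


def list_targets(host="127.0.0.1", port=9222):
    """Fetch the debuggable targets from Chrome's DevTools HTTP endpoint.

    Assumes Chrome was started with --remote-debugging-port=<port>;
    9222 is a common choice, not something the project mandates.
    """
    with urlopen(f"http://{host}:{port}/json") as resp:
        return json.load(resp)


def resolve_tab(targets, index):
    """Map a tab index (0, 1, 2, ...) to that tab's WebSocket debugger URL."""
    # Hypothetical rule: only "page" targets are treated as tabs.
    pages = [t for t in targets if t.get("type") == "page"]
    if not 0 <= index < len(pages):
        raise IndexError(f"no tab at index {index}; {len(pages)} tab(s) open")
    return pages[index]["webSocketDebuggerUrl"]
```

With Chrome running and remote debugging enabled, resolve_tab(list_targets(), 0) would return the URL for the first tab, which is the kind of lookup an index-based syntax saves the caller from doing by hand.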

Quick Start & Requirements

There are three ways to install:

  • Claude plugin marketplace: run /plugin marketplace add obra/superpowers-marketplace, then /plugin install superpowers-chrome@superpowers-marketplace.
  • npx: run npx github:obra/superpowers-chrome.
  • Local installation: clone the repository and run npm install within the mcp directory.

The primary requirement is an installed Google Chrome browser. For headed mode on Linux/WSL2, the DISPLAY environment variable must be configured. Detailed documentation is available in SKILL.md, EXAMPLES.md, and mcp/README.md.

Highlighted Details

  • Zero Dependencies: Features a built-in WebSocket server, negating the need for external package installations for core functionality.
  • Comprehensive Control: Offers 17 distinct commands covering navigation, interaction (click, fill), extraction, and raw CDP access.
  • Simplified API: Employs an "idiot-proof" tab index syntax for managing browser tabs.
  • Auto-Capture: Automatically captures page HTML, Markdown, screenshots, and DOM summaries after key actions like navigation or clicks.
  • Platform Agnostic: Designed to run on macOS, Linux, and Windows.
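Raw CDP access comes down to sending JSON messages over the WebSocket, each with an id, a method, and params. The sketch below builds two such messages in Python. Page.navigate and Runtime.evaluate are standard CDP methods; the helper name cdp_message and the counter-based id scheme are assumptions for illustration and say nothing about how the project's own commands are implemented.

```python
import itertools
import json

# Each CDP command needs an id so its response can be matched to it.
_ids = itertools.count(1)


def cdp_message(method, **params):
    """Serialize one CDP command as a JSON string (hypothetical helper)."""
    return json.dumps({"id": next(_ids), "method": method, "params": params})


# Navigate the tab, then read the page title via JavaScript evaluation.
navigate = cdp_message("Page.navigate", url="https://example.com")
evaluate = cdp_message("Runtime.evaluate", expression="document.title")
```

Chrome replies to each command with a JSON message carrying the same id, which is how a client pairs results with requests.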

Maintenance & Community

No specific details regarding maintainers, community channels (like Discord/Slack), or project roadmap were found in the provided README.

Licensing & Compatibility

The project is released under the MIT license, which permits broad usage, including commercial applications and integration into closed-source projects, with minimal restrictions.

Limitations & Caveats

Headed mode on Linux/WSL2 requires manual configuration of the DISPLAY environment variable. Although the tool is platform-agnostic, some host environments may need additional network configuration, particularly because DevTools traffic binds to 127.0.0.1 by default.
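A script that launches headed Chrome could check the DISPLAY requirement up front. The following Python sketch mirrors only the requirement stated above; the function name display_configured is hypothetical, and the check does not verify that the display is actually reachable.

```python
import os
import sys


def display_configured(env=None, platform=None):
    """Return True if headed mode has the DISPLAY variable it needs.

    Only Linux (which includes WSL2, where sys.platform is "linux")
    is checked; other platforms are assumed not to need DISPLAY.
    """
    env = os.environ if env is None else env
    platform = sys.platform if platform is None else platform
    if not platform.startswith("linux"):
        return True
    # An unset or empty DISPLAY both count as "not configured".
    return bool(env.get("DISPLAY"))
```

Calling display_configured() with no arguments inspects the current process environment; passing env and platform explicitly makes the check easy to exercise in tests.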

Health Check

  • Last Commit: 1 week ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 1
  • Star History: 37 stars in the last 30 days
