playwright-mcp by microsoft

MCP server for browser automation via Playwright

Created 9 months ago

25,342 stars

Top 1.5% on SourcePulse

14 Experts Love This Project

Yaowei Zheng

Author of LLaMA-Factory

Wes McKinney

Author of Pandas

Mckay Wrigley

Founder of Takeoff AI

Jonathan Ragan-Kelley

Professor at MIT

and 10 more!

Project Summary

This project provides a Model Context Protocol (MCP) server that uses Playwright for browser automation, enabling Large Language Models (LLMs) to interact with web pages. Tools operate on structured accessibility snapshots rather than pixels, which makes tool calls LLM-friendly and deterministic and removes the need for vision models and screenshots.

How It Works

The server captures Playwright's accessibility tree snapshots and converts them into a structured format suitable for LLMs. This approach avoids the computational overhead and potential ambiguity of image-based analysis. Users can opt into "Vision Mode", which uses screenshots for visual interaction and suits LLMs that work with coordinate-based inputs.
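To make the snapshot idea concrete, here is a minimal sketch of how an accessibility tree could be flattened into an indented text outline with element references. It is an illustration only: the AxNode shape, the renderSnapshot function, and the e1, e2, ... reference format are assumptions made for this example, not the server's actual types or output format.

```typescript
// Hypothetical node shape for illustration; not the real Playwright or
// playwright-mcp data structure.
interface AxNode {
  role: string;
  name?: string;
  children?: AxNode[];
}

interface Snapshot {
  text: string;                 // indented outline that would be handed to the LLM
  refs: Record<string, AxNode>; // reference id -> node, for resolving later tool calls
}

// Walks the tree depth-first, emitting one line per node and assigning each
// node a sequential reference id (assumed format: e1, e2, ...).
function renderSnapshot(root: AxNode): Snapshot {
  const lines: string[] = [];
  const refs: Record<string, AxNode> = {};
  let counter = 0;

  const visit = (node: AxNode, indent: string): void => {
    counter += 1;
    const ref = `e${counter}`;
    refs[ref] = node;
    const label = node.name ? ` "${node.name}"` : "";
    lines.push(`${indent}- ${node.role}${label} [ref=${ref}]`);
    for (const child of node.children ?? []) {
      visit(child, indent + "  ");
    }
  };

  visit(root, "");
  return { text: lines.join("\n"), refs };
}

// Example tree for a small sign-in page.
const page: AxNode = {
  role: "document",
  name: "Sign in",
  children: [
    { role: "textbox", name: "Email" },
    { role: "button", name: "Submit" },
  ],
};

const snapshot = renderSnapshot(page);
console.log(snapshot.text);
```

Because every element carries a stable reference, a model can name the element it wants to act on instead of guessing pixel coordinates from an image.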

Quick Start & Requirements

Installation: run npx @playwright/mcp@latest, or register the server through the VS Code CLI:
code --add-mcp '{"name":"playwright","command":"npx","args":["@playwright/mcp@latest"]}'
Prerequisites: Node.js, Playwright browser binaries (automatically installed by Playwright).
Configuration: Supports JSON configuration files for detailed setup.
Docker: Available as the mcp/playwright image, which supports headless Chromium only.
Docs: Playwright API Docs
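
The installation commands above can also be expressed as a client configuration entry. The sketch below builds that entry in TypeScript and serializes it two ways. The playwrightServerEntry helper is hypothetical, and the top-level mcpServers key is an assumption about the client's config format (many MCP clients use it, but check your client's documentation).

```typescript
// Shape of one server entry: the command to launch and its arguments.
interface McpServerEntry {
  command: string;
  args: string[];
}

// Hypothetical helper: builds the launch entry for the Playwright MCP server.
// Extra CLI arguments, if any, are appended after the package name.
function playwrightServerEntry(extraArgs: string[] = []): McpServerEntry {
  return { command: "npx", args: ["@playwright/mcp@latest", ...extraArgs] };
}

// Assumed client config layout with a top-level "mcpServers" map.
const clientConfig = { mcpServers: { playwright: playwrightServerEntry() } };
const configJson = JSON.stringify(clientConfig, null, 2);

// The same entry in the single-line form passed to `code --add-mcp`.
const vscodeArg = JSON.stringify({ name: "playwright", ...playwrightServerEntry() });

console.log(configJson);
console.log(vscodeArg);
```

Keeping the entry in one helper means the JSON config file and the VS Code CLI argument cannot drift apart.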

Highlighted Details

Supports both "Snapshot Mode" (default, accessibility tree) and "Vision Mode" (screenshots).
Offers a wide range of browser interaction tools, including navigation, form filling, element interaction, tab management, and file uploads.
Can handle browser dialogs and save pages as PDFs.
Can be used programmatically through a Node.js SDK.

Maintenance & Community

Developed by Microsoft.
Community channels and roadmap information are not explicitly detailed in the README.

Licensing & Compatibility

The README does not explicitly state a license. Check the LICENSE file in the repository before commercial use.

Limitations & Caveats

The Docker image currently supports only headless Chromium.
Vision Mode requires LLMs capable of coordinate-based interaction.
Detailed community support and roadmap are not readily available in the provided README.

Health Check

Last Commit

1 day ago

Responsiveness

1 day

Star History

1,152 stars in the last 30 days