playwright-mcp  by microsoft

MCP server for browser automation via Playwright

Created 6 months ago
19,863 stars

Top 2.2% on SourcePulse

Project Summary

This project provides a Model Context Protocol (MCP) server that uses Playwright for browser automation, enabling Large Language Models (LLMs) to interact with web pages. Its tools act on structured accessibility snapshots rather than pixels, which makes them LLM-friendly and deterministic and removes the need for vision models or screenshots.

How It Works

The server captures Playwright's accessibility tree snapshots and parses them into a structured format suitable for LLMs. This approach avoids the computational overhead and potential ambiguities of image-based analysis. Users can opt into "Vision Mode", which uses screenshots instead and suits LLMs that interact through coordinate-based inputs.
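
To make the snapshot idea concrete, here is a minimal sketch of how an accessibility tree might be flattened into text an LLM can read. The `AxNode` shape, the `renderSnapshot` helper, and the `[ref=eN]` labels are all assumptions made for illustration; the server's actual snapshot format may differ.

```typescript
// Hypothetical node shape; the real server's snapshot format may differ.
interface AxNode {
  role: string;
  name?: string;
  children?: AxNode[];
}

// Walk the tree depth-first, giving each node a reference id (e1, e2, ...)
// that a tool call could later use instead of pixel coordinates.
function renderSnapshot(root: AxNode): string {
  const lines: string[] = [];
  let counter = 0;
  const visit = (node: AxNode, depth: number): void => {
    counter += 1;
    const label = node.name ? ` "${node.name}"` : "";
    lines.push(`${"  ".repeat(depth)}- ${node.role}${label} [ref=e${counter}]`);
    for (const child of node.children ?? []) visit(child, depth + 1);
  };
  visit(root, 0);
  return lines.join("\n");
}

// Illustrative input: a sign-in form with two interactive children.
const page: AxNode = {
  role: "form",
  name: "Sign in",
  children: [
    { role: "textbox", name: "Email" },
    { role: "button", name: "Submit" },
  ],
};

console.log(renderSnapshot(page));
```

Because every element carries a textual reference, a model can name the element it wants to act on rather than guess screen coordinates, which is the determinism the summary describes.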

Quick Start & Requirements

  • Installation: npx @playwright/mcp@latest or via VS Code CLI: code --add-mcp '{"name":"playwright","command":"npx","args":["@playwright/mcp@latest"]}'
  • Prerequisites: Node.js, Playwright browser binaries (automatically installed by Playwright).
  • Configuration: Supports JSON configuration files for detailed setup.
  • Docker: Available as the mcp/playwright image, which supports headless Chromium.
  • Docs: Playwright API Docs
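
The VS Code command in the installation bullet takes the server entry as a JSON string. As a rough sketch, that string could be generated from a typed object instead of written by hand. `McpServerEntry` and `buildAddMcpCommand` are hypothetical names introduced here, and the quoting assumes a POSIX shell.

```typescript
// Shape of the server entry passed to `code --add-mcp` in the bullet above.
interface McpServerEntry {
  name: string;
  command: string;
  args: string[];
}

// Hypothetical helper: serialize the entry and wrap it in single quotes
// for a POSIX shell. Rejects input that would break the quoting.
function buildAddMcpCommand(entry: McpServerEntry): string {
  const json = JSON.stringify(entry);
  if (json.includes("'")) {
    throw new Error("entry contains a single quote; escape it before use");
  }
  return `code --add-mcp '${json}'`;
}

const playwrightEntry: McpServerEntry = {
  name: "playwright",
  command: "npx",
  args: ["@playwright/mcp@latest"],
};

console.log(buildAddMcpCommand(playwrightEntry));
```

Generating the string this way keeps the JSON valid when extra arguments are appended to `args`.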

Highlighted Details

  • Supports both "Snapshot Mode" (default, accessibility tree) and "Vision Mode" (screenshots).
  • Offers a wide range of browser interaction tools, including navigation, form filling, element interaction, tab management, and file uploads.
  • Can handle browser dialogs and save pages as PDFs.
  • Provides programmatic usage via Node.js SDK.

Maintenance & Community

  • Developed by Microsoft.
  • Community channels and roadmap information are not explicitly detailed in the README.

Licensing & Compatibility

  • The README does not explicitly state a license. Playwright itself is released under Apache-2.0, so a permissive license is likely here too, but check the repository's LICENSE file before commercial use.

Limitations & Caveats

  • The Docker image currently supports only headless Chromium.
  • Vision Mode requires LLMs capable of coordinate-based interaction.
  • Detailed community support and roadmap are not readily available in the provided README.

Health Check

  • Last commit: 1 day ago
  • Responsiveness: 1 day
  • Pull requests (30d): 67
  • Issues (30d): 111
  • Star history: 2,488 stars in the last 30 days
