fetcher-mcp  by jae-jae

MCP server for fetching web page content using Playwright

created 4 months ago
789 stars

Top 45.3% on sourcepulse

Project Summary

Fetcher MCP is an MCP server that fetches web page content with Playwright. It is aimed at users who need to handle dynamic, JavaScript-rendered pages and extract the main article text. It offers intelligent content extraction, a choice of output format (HTML or Markdown), and parallel fetching of multiple URLs, which makes it suitable for researchers and developers building content aggregation or analysis tools.

How It Works

Fetcher MCP uses Playwright to drive a headless Chromium browser, so it can execute JavaScript and interact with modern web applications before reading the page. An integrated Readability algorithm then extracts the main content and strips boilerplate such as ads and navigation. To save bandwidth, the server blocks non-essential resources (images, stylesheets, fonts, and media), and it includes error handling intended to keep fetches reliable.
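
The resource-blocking step can be illustrated with a small sketch. This is not the project's actual code: shouldBlockResource and BLOCKED_RESOURCE_TYPES are hypothetical names, and the blocked set is only an assumption based on the resource kinds listed in this summary. The strings match the values returned by Playwright's request.resourceType().

```typescript
// Resource types to skip. The exact set is an assumption taken from this
// summary (images, stylesheets, fonts, media), not from the project's source.
const BLOCKED_RESOURCE_TYPES: string[] = ["image", "stylesheet", "font", "media"];

// Hypothetical helper: returns true when a request of this type should be aborted.
function shouldBlockResource(resourceType: string): boolean {
  return BLOCKED_RESOURCE_TYPES.some((blocked) => blocked === resourceType);
}

// With Playwright installed, the helper could be wired in roughly like this:
//   await page.route("**/*", (route) =>
//     shouldBlockResource(route.request().resourceType())
//       ? route.abort()
//       : route.continue()
//   );

console.log(shouldBlockResource("image"));    // true
console.log(shouldBlockResource("document")); // false
```

Keeping the decision in a pure function makes it easy to test without launching a browser; only the commented-out wiring depends on Playwright.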

Quick Start & Requirements

  • Install Playwright browsers: npx playwright install chromium
  • Run the server: npx -y fetcher-mcp
  • Debug mode: npx -y fetcher-mcp --debug
  • Configuration for Claude Desktop: See README for macOS/Windows paths.
  • Requires Node.js and npm/npx.
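
For the Claude Desktop configuration, a minimal sketch of the entry might look like the following. It assumes the usual mcpServers layout of the Claude Desktop config file and reuses the run command shown above; the server label "fetcher" is an arbitrary choice for this example, and the file location is left to the README.

```typescript
// Shape of one entry under "mcpServers" in the Claude Desktop config file.
interface McpServerEntry {
  command: string;
  args: string[];
}

// Launch the server the same way as the "Run the server" step above.
const fetcherEntry: McpServerEntry = {
  command: "npx",
  args: ["-y", "fetcher-mcp"],
};

// "fetcher" is an arbitrary label chosen for this sketch.
const claudeDesktopConfig = {
  mcpServers: {
    fetcher: fetcherEntry,
  },
};

// Print the JSON that would go into the config file.
console.log(JSON.stringify(claudeDesktopConfig, null, 2));
```

Appending "--debug" to args would correspond to the debug mode listed above.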

Highlighted Details

  • Supports JavaScript execution via Playwright.
  • Intelligent content extraction with Readability algorithm.
  • Parallel fetching of multiple URLs via fetch_urls tool.
  • Resource optimization by blocking images, stylesheets, fonts, and media.
  • Configurable parameters for timeouts, content extraction, and output format.


Maintenance & Community

No specific contributors, sponsorships, or community links (Discord/Slack) are mentioned in the README.

Licensing & Compatibility

Licensed under the MIT License, permitting commercial use and integration with closed-source projects.

Limitations & Caveats

The README suggests setting waitForNavigation: true and increasing timeouts for sites that use anti-crawler mechanisms or load slowly, which indicates that some dynamic or protected websites may be difficult to fetch.
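
As a rough illustration of that advice, the sketch below builds an MCP tools/call request for the fetch_urls tool with waitForNavigation enabled and a longer timeout. Only the tool name and waitForNavigation come from this summary; the argument names urls and timeout, the millisecond unit, and the helper buildSlowSiteRequest are assumptions, so check the README for the real parameter names.

```typescript
// Hypothetical argument shape. "urls" and "timeout" are assumed names;
// waitForNavigation is the option mentioned in this summary.
interface FetchUrlsArguments {
  urls: string[];
  timeout: number; // assumed to be milliseconds
  waitForNavigation: boolean;
}

// Build a JSON-RPC 2.0 "tools/call" request, the method MCP uses to invoke a tool.
function buildSlowSiteRequest(id: number, urls: string[], timeoutMs: number) {
  const toolArguments: FetchUrlsArguments = {
    urls: urls,
    timeout: timeoutMs,
    waitForNavigation: true,
  };
  return {
    jsonrpc: "2.0",
    id: id,
    method: "tools/call",
    params: { name: "fetch_urls", arguments: toolArguments },
  };
}

// Example: allow 60 seconds for a slow page (URL is a placeholder).
const slowSiteRequest = buildSlowSiteRequest(1, ["https://example.com/slow"], 60000);
console.log(JSON.stringify(slowSiteRequest, null, 2));
```

In practice an MCP client such as Claude Desktop constructs this request itself; the sketch only shows where the two settings would appear.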

Health Check

  • Last commit: 1 month ago
  • Responsiveness: 1 week
  • Pull Requests (30d): 1
  • Issues (30d): 0
  • Star History: 144 stars in the last 90 days
