Web Scraping MCP servers

Web-scraping MCP servers such as Firecrawl, Apify, ScrapingBee, and the Fetch server turn arbitrary URLs into clean, model-ready content. They handle rendering, pagination, and markdown extraction so the agent gets text instead of a wall of HTML.

A Web Scraping MCP server is a small process that speaks the Model Context Protocol over stdio or HTTP (SSE in older servers, Streamable HTTP in newer ones), so an AI coding agent can call its tools directly instead of you copying output back and forth. The web scraping servers collected here are ingested from public GitHub repos and shown with their real author, star count, and SPDX license: no rewritten READMEs, no anonymous uploads.

Ranking is by quality score: source freshness, schema completeness, stars, and review signals, refreshed daily. Because the protocol is client-agnostic, every server below runs unchanged across Claude Code, Cursor, Windsurf, Cline, and Codex; only the registration step differs per client. Security research found that 36.7% of public MCP servers carry SSRF vulnerabilities, so every listing passes SPDX license gating and schema validation before it appears, and flagged listings go through manual admin review. Automated scanning is not yet in production, so review a server's source repo before wiring web scraping access into an autonomous agent.

Web Scraping MCP servers on RuleSell

Ranked by quality score — freshness, schema completeness, stars, and review signals. Refreshed daily.

How to add a Web Scraping MCP server

  1. Pick a server

    Open a listing above and check its source repo, license, and required environment variables (API keys, connection strings).

  2. Register it with your client

    In Claude Code, use the CLI. In Cursor, add it under Settings → MCP. The command shown after these steps registers a typical stdio server.

  3. Restart and verify

    Restart the client so it spawns the server, then confirm its tools appear in the active tool list before you let an agent call them.

# Claude Code (stdio server via the CLI)
claude mcp add web-scraping -- npx -y <package-name>

# Or edit ~/.claude.json directly:
{
  "mcpServers": {
    "web-scraping": {
      "command": "npx",
      "args": ["-y", "<package-name>"],
      "env": { "API_KEY": "..." }
    }
  }
}
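
If you would rather script the edit than change ~/.claude.json by hand, the merge can be sketched as follows. This is a minimal illustration, not part of any client's tooling: `add_mcp_server` is a hypothetical helper name, and the only thing taken from the snippet above is the `mcpServers` layout with `command`, `args`, and `env`. It keeps every existing key in the file and overwrites only an entry with the same server name.

```python
import json
from pathlib import Path
from typing import Optional


def add_mcp_server(config_path: Path, name: str, command: str,
                   args: list, env: Optional[dict] = None) -> dict:
    """Merge one stdio server entry into a JSON config, keeping existing keys.

    Hypothetical helper for illustration; the key layout mirrors the
    mcpServers snippet shown above.
    """
    # Start from the existing file if there is one, otherwise an empty config.
    if config_path.exists():
        config = json.loads(config_path.read_text())
    else:
        config = {}

    # Create the mcpServers section on first use, reuse it otherwise.
    servers = config.setdefault("mcpServers", {})

    entry = {"command": command, "args": list(args)}
    if env:
        entry["env"] = dict(env)
    servers[name] = entry  # replaces an entry that has the same name

    config_path.write_text(json.dumps(config, indent=2) + "\n")
    return config
```

Try it against a copy of the file first, for example `add_mcp_server(Path("claude-copy.json"), "web-scraping", "npx", ["-y", "<package-name>"], {"API_KEY": "..."})`, and keep the placeholders until you have the real package name and key from the listing.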

Web Scraping MCP servers — frequently asked

What's the best Web Scraping MCP server?

It depends on your stack. The servers listed on this page are ranked by RuleSell's quality score, a blend of source freshness, schema completeness, GitHub stars, and review signals. Start at the top of the list and pick the one whose source repo matches the tools you already use.

How do I install a Web Scraping MCP server in Claude Code or Cursor?

Add the server to your MCP config. In Claude Code, run "claude mcp add <name> -- <command>" or edit ~/.claude.json. In Cursor, open Settings → MCP and add the server's command and args. Restart the client so it spawns the server over stdio, then check that its tools appear in the active tool list.
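
For context on what "its tools appear" means at the protocol level: over stdio, a client exchanges newline-delimited JSON-RPC 2.0 messages with the server, and the tool list comes back from a `tools/list` request. The sketch below only builds those messages and parses a response; it does not spawn a server. The function names, the client name, and the chosen protocol revision are assumptions for illustration, so use the revision your server supports.

```python
import json

# Assumption: a protocol revision string; use the one your server supports.
PROTOCOL_VERSION = "2025-06-18"


def handshake_messages(client_name: str = "tool-check") -> list:
    """Return the JSON-RPC lines a client sends, in order, one message per line.

    A real client waits for the initialize response before sending the
    initialized notification; this sketch only shows the message shapes.
    """
    messages = [
        {"jsonrpc": "2.0", "id": 1, "method": "initialize",
         "params": {"protocolVersion": PROTOCOL_VERSION,
                    "capabilities": {},
                    "clientInfo": {"name": client_name, "version": "0.1.0"}}},
        # Notifications carry no id because no response is expected.
        {"jsonrpc": "2.0", "method": "notifications/initialized"},
        {"jsonrpc": "2.0", "id": 2, "method": "tools/list"},
    ]
    return [json.dumps(message) for message in messages]


def tool_names(response_line: str) -> list:
    """Extract tool names from one tools/list response line."""
    response = json.loads(response_line)
    if "error" in response:
        raise RuntimeError(response["error"].get("message", "tools/list failed"))
    return [tool["name"] for tool in response["result"]["tools"]]
```

If `tool_names` returns an empty list or raises, the server started but is not exposing tools, which usually points to a missing environment variable or a wrong package name rather than a client problem.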

Are these Web Scraping MCP servers audited?

Every MCP server on RuleSell is ingested from a public GitHub repo with SPDX license gating and schema validation, and flagged listings go through manual admin review. Automated scanning (VirusTotal, Semgrep, sandbox observation) is on the v2 roadmap, not in production today. Recent research found 36.7% of public MCP servers carry SSRF vulnerabilities, so review the source repo before granting a server broad permissions. The real state of our checks is always published at /trust.
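
When you review a scraping server's source for SSRF, one thing to look for is whether it refuses URLs that point at internal addresses before fetching them. The sketch below shows the kind of check that might appear; it is an illustration under stated assumptions, not the method any listed server uses. `is_obviously_internal` and `BLOCKED_HOSTNAMES` are hypothetical names, and the check covers only literal IP addresses and `localhost`. A complete guard would also resolve DNS names and re-check the resolved addresses, including after redirects.

```python
import ipaddress
from urllib.parse import urlsplit

# Hypothetical blocklist for illustration; literal IPs are handled separately.
BLOCKED_HOSTNAMES = {"localhost"}


def is_obviously_internal(url: str) -> bool:
    """Return True when a URL clearly targets a non-public destination.

    Limitation: DNS names are not resolved here, so a public-looking
    hostname that resolves to a private address is not caught.
    """
    parts = urlsplit(url)

    # Only plain web schemes are acceptable input for a scraper.
    if parts.scheme not in ("http", "https"):
        return True

    host = parts.hostname  # lowercased, brackets stripped for IPv6
    if not host:
        return True
    if host in BLOCKED_HOSTNAMES or host.endswith(".localhost"):
        return True

    try:
        address = ipaddress.ip_address(host)
    except ValueError:
        # Not a literal IP address, so it is a DNS name (unresolved here).
        return False

    # Loopback, private, link-local, and reserved ranges are all non-global.
    return not address.is_global
```

For example, `is_obviously_internal("http://169.254.169.254/latest/meta-data/")` is true because the address is link-local, while an ordinary public hostname passes this first filter.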

Do Web Scraping MCP servers work outside Claude Code?

Yes. The Model Context Protocol is client-agnostic, so these servers run with Claude Code, Cursor, Windsurf, Cline, Codex, and any other MCP-compatible client. The same server package works with every tool that speaks MCP, though each client needs its own registration entry.
