What is tool poisoning?

A server advertises a tool with a misleading description — `read_file` actually exfiltrates, `format_output` actually injects. The LLM trusts the description because it's how it knows what tools exist. Practical DevSecOps documented this pattern in 2025. Mitigations: review tool descriptions before approving them in your client, prefer vendor-maintained servers, run unfamiliar servers behind a permission UI that surfaces tool descriptions (Claude Code does, Claude Desktop partially).

What is the lethal trifecta?

Simon Willison's framing: any agent that has (a) access to private data, (b) the ability to communicate externally, and (c) the ability to ingest untrusted content is an exfiltration vector. Combining the three is the bug class. Example: an agent with read access to your inbox, write access to a web fetch tool, and the ability to render an attacker-controlled email body. The MCP server stack makes this trifecta one config-edit away — which is why we publish /topic/best-mcp-servers-2026 with the trifecta test applied.

Did Anthropic actually disagree with Ox Security publicly?

Yes. The Register reported in April 2026 that the Ox Security research team 'repeatedly asked Anthropic to patch the root issue, and were repeatedly told the protocol works just fine.' Ox documented 10+ high and critical CVEs across MCP server implementations. The disagreement is genuinely interesting: is the bug in the protocol's trust model, or in individual server authors not implementing safeguards the protocol leaves to them? We lean toward 'both, but mostly the implementations' — your mileage may vary.

What about the archived Postgres MCP server?

Anthropic archived `@modelcontextprotocol/server-postgres` in 2025. The known issue: a SQL-injection bypass via the `COMMIT; DROP TABLE` pattern that defeated the parameterized-query defense. The replacement most setups now use is crystaldba/postgres-mcp, run in read-only mode. We treat anything other than read-only as a security review item — see the lethal trifecta entry above for why.

What's a 'rug pull' in MCP?

Cycognito coined the term. A maintainer ships a trusted MCP server, builds an install base, then either gets compromised or sells the package — and the next version exfiltrates. The npm and PyPI ecosystems have seen this for years; MCP inherits it. Mitigations: pin versions, prefer vendor-maintained servers, use a registry that tracks ownership changes (the official modelcontextprotocol.io registry is metadata-only; downstream marketplaces like RuleSell are positioned to layer this signal on top).

Are MCP Servers Safe? 66% Scan Findings and 30+ CVEs Later

# Are MCP Servers Safe? The Honest Answer The protocol is fine. Many implementations are not. The number that matters is 66% — that's the share of MCP servers Snyk found security issues in during their January–February 2026 scan. Trend Micro found 492 MCP servers exposed to the public internet with no auth and no TLS. Ox Security published 10+ high and critical CVEs across community servers and went public when Anthropic told them the protocol "works just fine." This page is the map. What's actually broken, what the official docs gloss over, and what to do about it on a Tuesday afternoon when you need to install one server and ship.

Three classes of risk

The threats aren't theoretical. Each of these has a published CVE, a working PoC, or both.

Tool poisoning

A server advertises tools whose descriptions don't match their behavior. read_file(path) quietly logs paths to a remote endpoint. format_output(text) injects content the agent then trusts. The LLM cannot tell — the description is its ground truth. Practical DevSecOps documented the pattern in 2025. The mitigation is unglamorous: read the tool descriptions your client surfaces before you click "Allow." Most clients show them. Most users don't read them.

The lethal trifecta

Simon Willison's framing, and the single most useful security model for MCP. An agent becomes an exfiltration vector when it has all three of:

Access to private data (your inbox, your DB, your filesystem)

The ability to communicate externally (web fetch, email send, webhook POST)

The ability to ingest untrusted content (an email body, a database row written by a third party, a web page)

Pair any two, and the worst case is a misuse bug. Combine all three, and the worst case is an attacker-controlled email exfiltrating your bank statements via a tool the agent thinks is just "fetching a webpage." Practical defense: deny one of the legs per agent. A research agent reads the web but doesn't see private data. A code-review agent reads the repo but doesn't fetch the web. A customer-support agent reads tickets but only writes to vetted endpoints. The "/topic/best-mcp-servers-2026" list scores each server on which legs it adds.

Rug pulls

Cycognito's term. A trusted server gets compromised, a maintainer sells the package, or a community fork goes rogue. The next npm update or uv pip install ships the rugged version. MCP inherits this from the underlying package managers — the protocol itself doesn't change the picture. Mitigations: pin versions in your client config (not "latest"), prefer vendor-maintained servers with signing, and use a registry that tracks ownership changes. The official registry.modelcontextprotocol.io is metadata-only by design — Anthropic explicitly invited downstream marketplaces to layer trust signals on top. RuleSell is one of those downstream layers; so are PulseMCP and Glama.

The Postgres case study

The single most-cited security failure in MCP-land. Worth a paragraph. Anthropic shipped @modelcontextprotocol/server-postgres as a reference server. It accepted SQL queries from an LLM. The team thought parameterized queries blocked SQL injection. They blocked the obvious cases. They did not block the COMMIT; DROP TABLE-style multi-statement pattern that smuggled execution past the parameterizer. Anthropic archived the server in 2025; the crystaldba/postgres-mcp fork is the most-installed replacement. Use read-only mode. The full-access mode is technically functional but combines the trifecta in one config: private data (your DB), external comms (your agent has tools), untrusted input (database rows). A bad row plus a permissive agent plus full SQL access is the exact recipe.

What the Anthropic / Ox Security disagreement is about

The Register reported in April 2026 that the Ox Security research team "repeatedly asked Anthropic to patch the root issue, and were repeatedly told the protocol works just fine." Ox had documented 10+ high and critical CVEs across MCP servers and argued the protocol's trust model itself was at fault. This is genuinely contested. The argument cuts both ways. For Ox: the protocol trusts tool descriptions, trusts servers to enforce their own auth, and leaves permission UX to clients that vary wildly. A protocol that required signed tool descriptions or required per-tool scopes would be harder to ship — and harder to exploit. For Anthropic: every protocol delegates trust to its implementations. HTTPS doesn't prevent compromised CAs. The MCP spec includes auth primitives (OAuth 2.1 with PKCE per Auth0's docs) and permission events. If implementations skip them, that's an implementation problem. We won't pretend to resolve the disagreement. We will say: in practice, both sides are pointing at real problems, and the productive question for someone installing a server today is "what does this specific server's trust model look like" — not "is the protocol broken in the abstract."

What to actually do on Tuesday afternoon

A short checklist. None of it is glamorous.

Three servers, not eight. /topic/mcp-tool-overload covers the context-budget side. Fewer servers also means fewer trust decisions. Both wins.

Vendor first. GitHub, Microsoft Playwright, Stripe, Linear, Notion, Sentry, Atlassian — these have a security.md and an email that gets answered. Community servers can be great, but the maintenance budget varies.

Read the tool descriptions. Your client surfaces them. Skim before approving. If a tool's description doesn't match its name, that's the poisoning signal.

Apply the trifecta test per agent. Deny one of the three legs. A "do everything" agent with seven MCP servers is an exfil vector waiting for a calendar invite.

Pin versions. "latest" is a rug-pull primitive.

Read-only by default for data servers. Postgres, Supabase, filesystem — flip to write only when you've decided the agent's instruction sources are trustworthy.

Run unfamiliar servers locally first. MCP Inspector (official tool) lets you probe a server's surface without an LLM in the loop.

Where this fails

Three honest caveats. 1. We don't audit every server we list. We use vendor disclosures, CVE databases, and community signal. Auditing 23,451 servers is a different business than running a marketplace. 2. Security tooling for MCP is immature. OWASP has a working group; the OWASP MCP Top 10 is being written. SBOM-like artifacts for MCP servers don't exist. Static analyzers don't have MCP-specific rules. In 2026 the state of the art is "read the code or trust the vendor." 3. The compliance story is fragmented. No major framework (SOC 2, ISO 27001, HIPAA) has MCP-specific guidance yet. If you're in regulated industries, you're applying general third-party risk processes to a protocol that wasn't designed with them in mind.

Sources

The Register. "Anthropic MCP design flaw", April 2026 — Ox Security disagreement.

Snyk. State of MCP server security, January–February 2026 scans — 66% of scanned servers had findings.

Trend Micro. "MCP security: network-exposed servers as backdoors" — 492 unauthenticated servers.

Simon Willison. Lethal trifecta posts and Substack analysis.

Practical DevSecOps. "MCP security vulnerabilities" — tool poisoning documentation.

Cycognito. "MCP security and rug-pull risk".

crystaldba. postgres-mcp — replacement for the archived Anthropic Postgres server (SQL-injection bypass context in PR notes).

Anthropic. modelcontextprotocol/servers — 13 reference servers archived.

Anthropic. Registry preview announcement — explicit invitation to downstream marketplaces.

Stack Overflow blog. "Authentication and authorization in MCP", January 2026.

MCP Inspector. Official debugging tool docs.