Thesis
MCP is the right protocol. It's also the wrong trust model for production.
Eighteen months in, Model Context Protocol has become the default way agents reach tools — and the default way agents get owned. The CoSAI workstream approved in January catalogs twelve threat categories and roughly forty distinct attacks against MCP deployments. A March arXiv study tested seven major MCP clients against tool poisoning and found five of them accept server-provided metadata with no static validation. Supabase's MCP server shipped a prompt-injection-via-support-tickets bug that exposed private tables to connected clients.
None of this means MCP is broken. The protocol is a clean abstraction for a real problem. But the trust model is wrong for anything touching enterprise data, and most teams are treating the spec's "SHOULD" clauses as if they read "MAY". Our position: adopt MCP, then wrap it in the controls the spec declines to require. Here's the full shape of the argument.
Prerequisites
You've shipped at least one agent or tool-calling system in production
You know what a tool schema is and why tool definitions shape LLM behavior
You understand that LLMs treat any tokens arriving in context as potentially authoritative — including tokens returned by tool responses
What MCP is vs what it isn't
MCP is a protocol specification with a client-server architecture. Hosts run clients (Claude Desktop, Cursor, various IDEs). Clients connect to servers that expose tools, resources, and prompts to the model. The wire format is JSON-RPC with a capability-discovery handshake.
MCP isn't an agent framework. It doesn't decide what to call, when, or why. It doesn't provide authorization, audit, or runtime isolation — those are implementation concerns left to hosts and servers. It isn't a security boundary. The spec itself states that clients SHOULD provide a human-in-the-loop for tool invocation — a SHOULD the ecosystem has broadly ignored.
The distinction matters because most critiques of MCP conflate the protocol with the typical deployment. The protocol is minimal. The deployment pattern — install N servers, grant them broad scopes, trust their tool descriptions — is what creates the risk.
The genuine pros
Standardization, not the word but the thing. Before MCP, teams wrote a custom adapter per LLM provider per tool. Now one server speaks to every MCP client. The lock-in tax on AI tooling dropped materially.
Composition. Tools from different providers work together without bespoke glue. An MCP server for Supabase and one for Slack can participate in the same agent run.
Ecosystem scale. The MCPCorpus dataset documents roughly 13,875 servers and 300 clients. Network effects are real and working in the protocol's favor — picking an alternative now is picking against a compounding standard.
It matches how teams already build. Most organizations were already writing ad-hoc tool wrappers before MCP existed. The protocol gave the wrapper a shape without forcing a framework on top.
The genuine cons
Implicit trust by default. The spec lets servers declare their own tool schemas with no required client-side validation. Empirical testing across seven major clients found five of them accept metadata without static validation. A malicious server's tool description becomes part of the model's instruction context, and the model has no way to know it was authored by an attacker rather than the user.
Scope sprawl. The OAuth-flavored auth extension permits servers to request broad scopes up front. Users see a consent screen listing files:*, db:*, admin:* and either click through or abandon the workflow. The protocol doesn't enforce minimal initial scope — the security document suggests it.
Rug-pull tool mutation. MCP tool definitions can change after a server is installed and approved. Day 1 the tool is safe; Day 7 it rewrites its own description to exfiltrate secrets. Most clients don't notify users when descriptions change, which makes the approval gate a one-shot decision against a moving target.
Content poisoning as persistent injection. The CoSAI taxonomy names resource content poisoning as its own category: instructions embedded in data the server returns. Tickets, database rows, documents — any field the model reads can carry text that executes as an instruction. This isn't strictly an MCP bug; it's a consequence of LLMs not separating data from commands. But MCP expands the blast radius because it standardizes the data flow.
Sampling reverses the trust direction. The sampling primitive lets servers request completions from the client's LLM. Unit 42 identified three practical attacks on this flow — resource theft, conversation hijacking, and covert tool invocation — each demonstrated in a production copilot. Most deployments have sampling enabled without understanding what the feature actually permits.
Our position
Adopt MCP. Treat every server as untrusted. Enforce at the host layer what the spec only suggests.
Concretely:
Static validation of all server-provided metadata. Fail closed on schema changes without explicit re-consent. Show the user the diff before accepting it.
Scope minimization by default. Start with the smallest useful scope. Elevate per-call when a privileged tool is first invoked. Reject servers that demand omnibus scopes at install time.
Treat tool responses as untrusted input. The same posture you apply to form submissions. If the model is going to act on the response, the response gets sanitized or quarantined before it reaches the next turn.
Human approval for write-path tools, always. No convenience exceptions. If the spec says SHOULD and the action has a side effect, read it as MUST.
Durable audit of every tool call — which server, which tool, which scope, which actor, what outcome. If you can't replay the decision, you can't defend it to a regulator or a customer.
Kill sampling unless you specifically need it. The primitive's attack surface isn't worth the convenience for most deployments.
This is host-side engineering that MCP does not do for you. Teams adopting MCP without it are counting the protocol's win while deferring the trust work.
The reframe worth holding onto
MCP is a transport standard, not a trust system. Treating it as a trust system is the category error producing most current incidents.
For teams building compliance-sensitive or multi-tenant products, the implication is sharper: you can't responsibly connect an MCP server to a system of record without a deterministic control plane around it — idempotency, durable execution, explicit state, policy gates. MCP gives the agent the reach. The control plane keeps the reach from becoming harm.
References / further reading
MCP Security Best Practices (spec draft) — modelcontextprotocol.io. Read this as the normative document and pay attention to what's MUST versus SHOULD.
CoSAI Workstream 4: Secure Design Patterns for Agentic Systems — MCP Security (approved Jan 2026). The twelve-category threat taxonomy; closest thing the ecosystem has to consensus on attack surface.
Unit 42, New Prompt Injection Attack Vectors Through MCP Sampling (Dec 2025). Three documented PoCs against a production copilot's sampling feature.
Fard et al., Model Context Protocol Threat Modeling and Analyzing Vulnerabilities to Prompt Injection with Tool Poisoning (arXiv:2603.22489, March 2026). Seven-client empirical comparison of client-side validation.
Simon Willison, Model Context Protocol has prompt injection security problems (April 2025). Early, concise practitioner framing of the trust gap.
Red Hat, Model Context Protocol (MCP): Understanding security risks and controls (Nov 2025). Practical guide to command-injection patterns in server implementations.
