MCP is becoming the USB‑C of agent tooling—a universal interface that lets LLMs query data, call APIs, and take real‑world actions. Adoption is accelerating fast.

But every convenience layer in security carries the same risk: teams connect it before they’ve defined the capability model. MCP is no exception. It creates a new trust boundary, and most deployments aren’t treating it like one.

If you’re deploying MCP (or any agent tool protocol), the right mental model isn’t “prompt injection is an LLM bug we can filter.” The right mental model is: MCP is a capability system. Once you see it that way, the security work becomes familiar—permissions, isolation, auditing, and blast‑radius control.


The Trust Boundary Shift

Traditional apps have well-understood trust boundaries: user input (untrusted), application logic (trusted), and external APIs (trusted via authz/authn). Agents blur these lines because the LLM sits in the middle, translating language into actions.

MCP formalizes that translation layer. It standardizes how tools are described, invoked, and how results are returned. That’s good. But it also means more tools will be connected faster, more services will expose “agent-ready” interfaces, and more developers will treat tool integration as plumbing rather than security.

And that’s where the trouble starts.


Sampling: When the Server Shapes the Agent

Most MCP interactions follow a simple pattern: your agent asks a server for something (a database query, an API call), and the server returns data. The agent stays in control.

Sampling flips this. With sampling enabled, an MCP server can ask your agent’s LLM to generate text on its behalf—meaning an external server can shape what your model thinks about next, not just what data it returns.

Why does this matter? Because now a malicious or compromised server doesn’t just return bad data—it can inject prompts that influence your agent’s next actions. The direction of influence reverses.

Security researchers have documented prompt-injection and tool-misuse paths that emerge when MCP servers can influence agent workflows.

Reference: https://unit42.paloaltonetworks.com/model-context-protocol-attack-vectors/

A concrete example: imagine an MCP “ticket” tool that returns an issue description containing hidden instructions. The agent treats it as normal context, fetches internal docs to be “helpful,” and then writes a summary into another system (a comment, a new ticket, a log, a shared note). No exploit needed—just influence plus permissions.

Sampling policy (simple default):

  • Off by default.
  • Only enable for servers you own or explicitly trust.
  • Never allow sampling to trigger write actions without an explicit user confirmation gate.

A Practical MCP Security Checklist (PM-friendly)

  1. Treat every MCP server as a third‑party integration
  • Allowlist approved servers and endpoints.
  • Assign ownership (who reviews changes? who responds to incidents?).
  • Version and change-manage tool definitions.
  1. Default‑deny capabilities; grant narrowly
  • Start read‑only.
  • Separate read tools from write tools.
  • Gate “admin” tools behind extra confirmation.
  1. Enforce authorization at the tool boundary
  • Scope tokens to tool/resource/time.
  • Don’t rely on the LLM to self‑police.
  1. Require explicit user confirmation for high‑risk actions Especially for: sending messages/emails, creating calendar events, writing to external systems, exporting/sharing files.

In agent systems, confirmation is a security control.

  1. Log tool calls as security telemetry Log: tool name, target resource, redacted parameters, result size/type.

Alert on: read → write → send, unusual sequences, repeated small reads over time.

  1. Add quotas and circuit breakers
  • Per‑server budgets.
  • Per‑tool rate limits.
  • Caps on expensive actions.
  1. Isolate MCP servers
  • Containers/sandboxes.
  • Restrict network egress.
  • Treat tool servers like any other untrusted service boundary.

Where I Land

MCP will accelerate the agent ecosystem. The winning teams won’t be the ones with the fanciest prompts—they’ll be the ones that treat MCP as security plumbing: least privilege, isolation, and auditability.

Ship the capability model first. Then let the model be smart inside the box.

Disclaimer: The views expressed here are my own and do not represent those of my employer.