MCP Sampling Needs Its Own Approval Gate

MCP sampling is not just another tool setting. It lets an MCP server ask the client to run a model call.

That deserves its own approval gate.

Short answer

Treat MCP sampling as a server-initiated model call. Before you enable it, review which server can request sampling, what prompt reaches the model, what context is included, whether sampling can use tools, what the generated response returns to the server, and where the decision is logged. If you cannot inspect and deny the request before the model call, sampling is too trusted for a sensitive workflow.

Why sampling changes the trust boundary

Most MCP reviews start with tools.

What tools does this server expose? What can they read? What can they write? Does the client show a confirmation prompt before a tool runs?

That review is still necessary, but sampling adds a different question.

Can the server ask the client to run the model?

The MCP sampling specification describes sampling as a way for servers to request LLM sampling from clients while clients maintain control over model access, model selection, and permissions.

In plain English: an MCP server can ask the client to send a prompt to an LLM and return the result.

That is useful. A server may need the model to summarize, transform, plan, or reason over something it cannot do alone.

It is also a new trust boundary.

The server is no longer only exposing tools or resources. It is asking the client to spend model capability, potentially include context, and return a generated answer.

If that path is automatic, the server has more influence than most approval screens make obvious.

The spec already points at the gate

This is not only a defensive interpretation.

The MCP sampling specification says there should always be a human in the loop with the ability to deny sampling requests. It also says applications should make sampling requests easy to review, allow users to view and edit prompts before sending, and present generated responses for review before delivery.

That gives builders a practical control shape:

Review the request before the model call.
Review the response before it returns to the server.
Log both decisions.

The important word is both.

A pre-call review catches bad prompts, broad context, tool-enabled loops, and requests from servers that should not be asking for model access.

A post-call review catches sensitive output, unexpected tool results, and model completions that should not be returned to the server.

If your MCP client only asks, "Do you trust this server?" at install time, that is not enough for sampling.

Sampling can become a tool loop

Sampling gets sharper when tools enter the picture.

The sampling specification includes support for tool-enabled sampling when the client declares the relevant capability. The source card for the spec notes that servers can request that the client's LLM use tools during sampling.

Keep reading with free field-guide resources.

VibeSec Advisory publishes practical research, Skills, workflow examples, MCP notes, prompt injection tests, and AI red-team lessons for builders working with agentic AI.

Read the research Browse Skills

Now the review question changes again.

It is not only, "Can this server ask for a completion?"

It is, "Can this server ask the client's model to call tools, inspect results, and keep going?"

That is a nested control plane. The MCP server asks for sampling. The model may call tools. Tool results may return to the model. The final response may return to the server.

Each step can cross a data boundary.

The MCP tools specification says MCP tools are model-controlled, meaning the language model can discover and invoke tools automatically based on context and the user's prompt. It also says there should always be a human in the loop with the ability to deny tool invocations.

Put those together and the safer pattern is clear.

Sampling approval and tool approval should not be collapsed into one vague trust prompt.

What can go wrong

Unit 42's MCP sampling research describes sampling as reversing the usual client-driven pattern because an MCP server can proactively request LLM completions from the client.

Their proof-of-concept work describes three attack classes in a coding copilot context:

Resource theft
Conversation hijacking
Covert tool invocation

Treat that as a practical warning, not as proof that every MCP sampling implementation is broken.

The useful lesson is narrower: if a malicious or compromised server can request sampling, it may try to shape the prompt, consume model resources, steer the conversation, or trigger tool behavior the user did not intend.

That maps cleanly to two OWASP risk categories.

OWASP LLM01: Prompt Injection describes inputs that alter the LLM's behavior or output in unintended ways. OWASP also notes that indirect prompt injection can come from external sources such as websites or files.

OWASP LLM06: Excessive Agency focuses on excessive functionality, permissions, and autonomy. OWASP recommends minimizing extensions, minimizing extension functionality, avoiding open-ended extensions, minimizing permissions, and requiring approval for high-impact actions.

MCP sampling can touch both.

A server-supplied sampling prompt can become an instruction source. Tool-enabled sampling can turn that instruction source into action if permissions are broad enough.

The sampling approval record

Before enabling sampling for an MCP server, write this down:

Requesting MCP server:
Why this server needs sampling:
Prompt shown to reviewer:
Context included:
Model or model class requested:
Can sampling use tools:
Specific tools allowed:
Data those tools can read:
Actions those tools can take:
Generated response reviewed before return:
Default decision if unclear:
Log location:
Owner:
Review date:

If the workflow is sensitive, do not rely on memory. Put the approval record somewhere a reviewer can inspect later.

This record is small on purpose. If a team cannot fill it out, they probably cannot explain the sampling trust boundary.

Deny or narrow these requests

A sampling request should be denied or narrowed when:

The reviewer cannot see the prompt before it reaches the model
The request includes broad context from other servers or the whole client session
The server asks for tool-enabled sampling without a specific reason
The allowed tools can read private data and communicate externally in the same path
The generated response returns sensitive information to a low-trust server
The server can repeat sampling calls without a quota, rate limit, or review trail
The implementation has no durable log of the request, decision, and returned response

This is the same shape as any good agent approval gate. The difference is that the action is a model call requested by a server, not a normal user prompt.

A safer default

Start with sampling disabled for third-party or low-trust MCP servers.

If a workflow needs it, enable the smallest version:

Allow sampling only for named servers.
Require review before the prompt is sent.
Limit included context.
Disable tool-enabled sampling unless the use case needs it.
Require separate approval for any tool call inside sampling.
Review the generated response before returning it to the server.
Log the request, prompt, context class, decision, response class, and reviewer.

The goal is not to make sampling impossible.

The goal is to keep a server-initiated model call from becoming an invisible authority path.

Evidence versus opinion

Evidence from the sources:

The MCP sampling specification says clients should maintain control over model access, selection, and permissions.
The MCP sampling specification calls for a human in the loop with the ability to deny sampling requests, plus prompt review before sending and response review before delivery.
The MCP tools specification says tools are model-controlled and calls for a human-deniable path for tool invocations.
Unit 42 describes proof-of-concept sampling abuse classes including resource theft, conversation hijacking, and covert tool invocation.
OWASP LLM01 frames indirect prompt injection as external content altering model behavior.
OWASP LLM06 frames excessive agency as too much functionality, permission, or autonomy in an LLM-based system.

My opinion:

Sampling needs its own approval gate because it gives an MCP server a way to ask the client for model work. That work may include context, tools, and a response sent back to the server. A one-time server trust prompt is not a meaningful control for that path.

If the request, prompt, context, tools, response, and log are not reviewable, do not enable sampling for sensitive workflows.

Free next step

Test one MCP server you use. Look for sampling support, then write the approval record above. If sampling can use tools, compare the workflow to the MCP permission review, the lethal trifecta test, and the AI approval gate checklist before you turn it on.

AI Workflows Weekly

Read the archive

Practical notes on governed AI workflows, guardrails, and safer automation. No spam, unsubscribe anytime.

Keep testing agentic AI risk.

VibeSec Advisory is a free field guide. Use the research archive, Skill Library, and workflow examples to keep improving what you are building.