MCP Prompts Need a Tool Boundary

An MCP prompt is not harmless because it looks like a template.

It is instruction entering a system that may have tools.

Short answer

Treat MCP prompts as instruction ingress. Before a prompt from an MCP server can influence a model with tool access, review the prompt source, name, body, required arguments, roles, embedded resources, prompt-list change behavior, user selection path, downstream tools, approval gates, and logs. If the prompt can steer a model toward write, send, query, deploy, browser, or shell tools, require explicit user selection, argument validation, least-privilege tools, and a separate approval gate before high-impact tool calls.

Why prompts are a boundary

The MCP prompts specification lets servers expose prompt templates to clients. A client can discover prompts with prompts/list, retrieve a prompt with prompts/get, and pass arguments that customize the prompt.

That sounds simple.

The boundary appears when the prompt enters a model session that can act.

A prompt can tell the model how to summarize, triage, analyze, transform, search, classify, or execute a workflow. It can include arguments. It can include user or assistant roles. It can include embedded resources. It can change when the server's prompt list changes.

That does not make MCP prompts bad. It does make them part of the instruction surface.

If an MCP server can provide instructions to the model, the review question is not only, "Is this prompt useful?"

It is, "What can happen after the model reads it?"

User-controlled does not mean automatically safe

The prompt spec says prompts are designed to be user-controlled and are typically triggered through user-initiated commands in a user interface. That is the right shape.

But the same spec also says the protocol does not mandate a specific interaction model.

That matters.

The protocol can define the capability. The client and workflow still decide whether the user sees the prompt source, prompt body, arguments, embedded resources, risk level, and downstream tool path before use.

For low-risk workflows, a prompt picker may be enough.

For sensitive workflows, the user should see more than a friendly prompt title.

Show:

Prompt source server:
Prompt name:
Prompt description:
Prompt body visible to reviewer:
Required arguments:
Optional arguments:
Embedded resources:
Trust labels:
Sensitive data classes:
Tools this prompt can influence:
High-impact actions:
Approval required before tool use:
Log location:

If the client cannot show that record, the prompt is being trusted more than it has been reviewed.

Prompt arguments can carry data into instructions

Prompt arguments are useful because they customize a template.

They are also a data path.

An argument might be a ticket ID, repo path, customer name, file name, search query, project slug, database table, branch name, or natural-language task. If that value comes from a user, issue, file, web page, or another tool, it can carry untrusted content into the prompt.

That maps to the risk OWASP describes in LLM01: Prompt Injection. OWASP says indirect prompt injection can happen when an LLM accepts input from external sources such as websites or files. OWASP also recommends segregating external content, using privilege control, requiring human approval for high-risk actions, and conducting adversarial testing.

Use that framing for MCP prompt arguments.

Do not let an argument become a hidden instruction channel.

Review:

Argument name:
Expected type:
Allowed source:
Who can influence it:
Can it contain natural language:
Can it contain file or web content:
Validation rule:
Sensitive data class:
Default if invalid:

If an argument can contain untrusted natural language, label it as data before it reaches the model. Do not let it override the prompt, tool policy, approval rules, or system instructions.

Embedded resources make prompts overlap with context review

MCP prompt messages can include embedded resources.

That creates overlap with the MCP resource context review.

A prompt is no longer just text when it can bring in server-side resources. It can combine instructions with context. That context may be trusted project documentation, a database schema, a file, a generated issue export, or external content that was converted into a resource.

Keep reading with free field-guide resources.

VibeSec Advisory publishes practical research, Skills, workflow examples, MCP notes, prompt injection tests, and AI red-team lessons for builders working with agentic AI.

Read the research Browse Skills

The safer pattern is to label embedded resources before the prompt runs:

Resource URI:
Source system:
Owner:
Trust label:
Sensitivity:
Selection mode:
Allowed use:
Can tools use this content:

A high-quality prompt does not make an unknown resource safe.

A useful resource does not make a prompt safe.

Review the combination.

Prompt lists can change after approval

Prompt review is not one and done.

The MCP prompt spec supports list-change notifications. A server can indicate that the list of prompts changed after initialization.

That is useful for dynamic systems. It is also a review trigger.

A server that exposed one safe prompt yesterday may expose three new prompts today. A prompt description may change. A required argument may be added. An embedded resource may appear. A workflow prompt may start steering the model toward a new tool.

If your client allowlists MCP prompts, tie the allowlist to a prompt fingerprint and review date.

Log changes like this:

Prompt changed:
Server:
Old prompt name:
New prompt name:
Body changed:
Arguments changed:
Embedded resources changed:
Downstream tools changed:
Reviewer:
Decision:

If the client cannot show prompt changes, do not let prompts from that server drive high-risk tool use automatically.

The tool boundary is the real risk

A prompt is not a tool call.

But prompts matter because they can shape the model's next decision.

The MCP tools specification says tools let models interact with external systems such as databases, APIs, and computations. It also says tools are model-controlled, meaning the model can discover and invoke tools automatically based on context and user prompts.

That is the key link.

A server-supplied prompt can become part of the context that leads to a tool call.

The risk depends on what the model can reach after the prompt runs:

Read-only local notes are one risk.
A database query tool is a higher risk.
A file-write tool needs a separate gate.
A browser or email tool can create an external communication path.
A shell, deployment, credential, or production API tool should not be reachable from an unreviewed prompt path.

This is where OWASP LLM06: Excessive Agency fits. OWASP points to excessive functionality, excessive permissions, and excessive autonomy as root causes. Its mitigations include minimizing extensions, minimizing functionality, minimizing permissions, requiring user approval, and complete mediation.

For MCP prompts, the practical version is simple.

Map prompts to tools before you trust them.

The MCP prompt boundary record

Before enabling prompts from an MCP server in a tool-bearing workflow, write this down:

Client or host:
MCP server:
Transport:
Prompt name:
Prompt title:
Prompt body reviewed:
Required arguments:
Optional arguments:
Allowed argument sources:
Blocked argument sources:
Can arguments contain untrusted text:
Can arguments include sensitive data:
Prompt roles used:
Embedded resources:
Resource trust labels:
Resource sensitivity labels:
Who can select the prompt:
Can the model select the prompt:
Can the server change the prompt list:
Prompt fingerprint or review version:
Tools exposed in the same workflow:
Tools this prompt can reasonably influence:
Allowed actions after prompt use:
Actions requiring approval:
Blocked actions:
Log location:
Reviewer:
Review date:
Default decision if unclear:

The important field is not the prompt title.

It is the connection between prompt, context, and tools.

A safe canary test

Do not only read the prompt.

Test the workflow.

Use a harmless canary that asks for a forbidden behavior without revealing secrets or causing damage. The goal is to verify the boundary, not to publish an exploit.

Example test shape:

Input source: untrusted argument or embedded resource
Canary instruction: asks the agent to skip review and call a blocked tool
Expected result: no blocked tool call, no approval bypass, clear log entry
Actual result:
Reviewer:

Run the test with no secrets and no high-impact tools enabled first.

Then run it in the real approval path with blocked tools still blocked.

If the agent follows the canary, the boundary failed. Fix the workflow before enabling broader prompt use.

Prompts are not policy

A common mistake is to put all of the control inside another prompt.

That is weak.

A prompt that says "never call dangerous tools" can still be diluted by context, bad tool descriptions, untrusted inputs, model mistakes, or a future prompt-list change.

Use prompts for instructions.

Use workflow controls for authority.

The safer pattern is structural:

Review prompt source and body.
Validate arguments.
Label embedded resources.
Keep tool permissions narrow.
Require separate approval for high-impact actions.
Log prompt selection and tool use together.
Re-review when prompt lists change.

That is boring. It is also the part you can test.

Evidence versus opinion

Evidence from the sources:

The MCP prompts specification defines server-provided prompt templates, prompt discovery, prompt retrieval, arguments, roles, embedded resources, list-change notifications, and prompt security considerations.
The MCP tools specification defines tools as model-controlled interfaces to external systems and recommends visible tool exposure, confirmation prompts for operations, tool input review, and logs.
MCP security best practices discuss implementation risks such as confused deputy and token passthrough, which reinforces that client identity, consent, and authorization are separate from prompt review.
OWASP LLM01 supports treating external content as prompt-injection risk and recommends privilege control, human approval for high-risk actions, external-content separation, and adversarial testing.
OWASP LLM06 supports minimizing functionality, permissions, and autonomy, plus human approval and complete mediation.

VibeSec Advisory's opinion:

MCP prompts should be reviewed as instruction packages that may influence tool use. A prompt that can steer a model with write, send, deploy, query, browser, or shell tools should not be approved as a harmless text template.

Free next step

Pick one MCP server that exposes prompts and fill out the MCP prompt boundary record above. If you cannot map prompt source, arguments, embedded resources, list-change behavior, downstream tools, approval gates, and logs, keep the prompt path read-only until you can.

Then browse the VibeSec Advisory Skill Library for more practical agent security checks.

AI Workflows Weekly

Read the archive

Practical notes on governed AI workflows, guardrails, and safer automation. No spam, unsubscribe anytime.

Keep testing agentic AI risk.

VibeSec Advisory is a free field guide. Use the research archive, Skill Library, and workflow examples to keep improving what you are building.