Tool Result Contracts for AI Agents

A tool result is not evidence until the workflow says how to verify it.

Most agent designs spend a lot of time on tool access. Which APIs can the agent call? Which MCP servers are installed? Which actions require approval?

That is necessary. It is not enough.

The next failure shows up after the call succeeds. The agent gets a search result, browser extract, API response, file read, or MCP tool output. Then it treats that output as ground truth, even when the result is stale, partial, ambiguous, sourced from untrusted content, or valid only for one narrow next step.

This is where teams need a tool result contract.

A tool result contract is the structured envelope that tells the agent what the result is allowed to prove. It should travel with the result before the model summarizes it, reasons from it, stores it in memory, or uses it to call another tool.

At minimum, the contract should answer seven questions.

Which tool produced this result?
Which source did the value come from?
Does the result match the expected schema?
When was it observed, and when does it expire?
Is this direct evidence, an inference, an absence claim, or an ungrounded claim?
Which future arguments may this result influence?
Does this result require human review before action?

The research is moving in this direction.

ProvenanceGuard argues that MCP-grounded agents need source-aware factuality checks because pooled evidence is not enough. A claim can be supported somewhere while still being attributed to the wrong source. That is a different failure from ordinary hallucination. It is source conflation.
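
A toy sketch of the distinction, not ProvenanceGuard's actual method: the function name, the labels, and the substring match are all assumptions made for illustration. A real verifier would use an entailment or retrieval check in place of the substring test. The point is only that support is checked per source, not against the pooled evidence.

```python
def check_attribution(claim: str, attributed_source: str, evidence_by_source: dict) -> str:
    """Classify a claim against evidence that is kept separate per source.

    Substring matching is a stand-in for a real support check.
    """
    supporting = [
        source
        for source, text in evidence_by_source.items()
        if claim.lower() in text.lower()
    ]
    if attributed_source in supporting:
        return "supported"
    if supporting:
        # Supported somewhere, but not by the source the agent named.
        return "source_conflation"
    return "unsupported"
```

With evidence pooled into one string, the second case would look identical to the first. Keeping a source_id on every result is what makes the difference visible.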

PACT makes the security version of the same point. Untrusted content is not dangerous merely because it appears in context. It becomes dangerous when it binds an authority-bearing argument. A webpage can influence the body of a summary. It should not be allowed to choose the recipient, command, file path, credential, target URL, or control flag.
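
The idea can be sketched as a check on argument binding. This is an illustration under stated assumptions, not PACT's mechanism: the Tainted wrapper, the check_call function, and the role names are hypothetical, and the role list simply mirrors the examples in the paragraph above.

```python
from dataclasses import dataclass

# Argument roles treated as authority-bearing in this sketch (hypothetical names).
AUTHORITY_ROLES = {
    "recipient", "command", "file_path", "credential", "target_url", "control_flag",
}


@dataclass(frozen=True)
class Tainted:
    """A value plus a flag saying whether it came from trusted input."""
    value: str
    trusted: bool


def check_call(arg_roles: dict, args: dict) -> list:
    """Return the names of arguments where untrusted content binds an authority role.

    Arguments with no declared role are treated as authority-bearing (deny by default).
    """
    violations = []
    for name, arg in args.items():
        role = arg_roles.get(name)
        authority_bearing = role is None or role in AUTHORITY_ROLES
        if authority_bearing and not arg.trusted:
            violations.append(name)
    return violations
```

For an email tool with roles {"to": "recipient", "body": "body"}, untrusted page text in body passes, while an untrusted value in to is reported as a violation.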

Tool receipt research adds another useful primitive. The runtime can record the tool name, input hash, output hash, result count, extracted facts, timestamp, and receipt ID. Then the agent's claims can be checked against what the tool actually returned. That lets reviewers distinguish direct tool output from inference, absence claims, external citations, and unsupported statements.

ContractBench shows why this matters for ordinary API workflows too. Tool-returned artifacts like presigned URLs, OAuth state parameters, signed tokens, and webhook payloads often carry time and integrity rules. The model may preserve the general task while breaking the observation contract that makes the next step valid.
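
As a rough sketch of an observation contract for such an artifact, the runtime can record a hash and an expiry when the value is first observed, then check both before the value is used. The names below are hypothetical and are not taken from ContractBench.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class ObservedArtifact:
    sha256: str
    expires_at: datetime


def _sha256(value: str) -> str:
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


def observe(value: str, ttl: timedelta, now: datetime) -> ObservedArtifact:
    """Record integrity and time bounds at the moment the tool returns the value."""
    return ObservedArtifact(sha256=_sha256(value), expires_at=now + ttl)


def check_before_use(candidate: str, observed: ObservedArtifact, now: datetime) -> str:
    """Return 'ok', 'expired', or 'modified' for a value the agent wants to pass on."""
    if now >= observed.expires_at:
        return "expired"
    if _sha256(candidate) != observed.sha256:
        # The model rewrote, truncated, or re-encoded the artifact.
        return "modified"
    return "ok"
```

The hash comparison catches the quiet failure where a model "helpfully" trims or reformats a signed URL or token while keeping the task on track.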

ToolBench-X makes the operational risk concrete. Tool environments can fail through specification drift, invocation error, execution failure, output drift, and cross-source conflict. A correct function call is only the beginning. The agent still needs to detect when the environment is unreliable and recover without inventing missing evidence.

The practical recommendation is simple:

Do not let agents consume raw tool results.

Wrap every important result in a contract.

A minimal version looks like this:

Identity: tool_id, server_id, source_id, call_id, trace_id
Structure: schema_id, validation_status, missing_fields, canonicalized_fields
Freshness: retrieved_at, last_modified, expires_at, ttl
Evidence: claim_type, receipt_id, direct_facts, derived_from
Influence: allowed_roles, forbidden_roles, argument_role_impacts
Error state: is_error, error_type, retryable, blocked_reason
Review: requires_human_review, review_reason, destructive_or_sensitive
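
One way to carry these fields is a typed envelope. The sketch below is an assumption about shape, not a standard: the class names, example values, and defaults are illustrative, the field names come from the list above, and only a subset of the fields is shown.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Any, Optional


@dataclass(frozen=True)
class ToolResultContract:
    # Identity
    tool_id: str
    source_id: str
    call_id: str
    # Structure
    schema_id: str
    validation_status: str            # e.g. "valid" or "invalid" (illustrative values)
    # Freshness
    retrieved_at: datetime
    expires_at: Optional[datetime]
    # Evidence
    claim_type: str                   # "direct", "inference", "absence", or "ungrounded"
    # Influence
    allowed_roles: frozenset = frozenset()
    forbidden_roles: frozenset = frozenset()
    # Error state
    is_error: bool = False
    # Review
    requires_human_review: bool = False


@dataclass(frozen=True)
class WrappedResult:
    """The raw payload never travels without its contract."""
    contract: ToolResultContract
    payload: Any


now = datetime.now(timezone.utc)
result = WrappedResult(
    contract=ToolResultContract(
        tool_id="browser.extract",
        source_id="https://example.com/page",
        call_id="call-1",
        schema_id="page_extract.v1",
        validation_status="valid",
        retrieved_at=now,
        expires_at=now + timedelta(minutes=10),
        claim_type="direct",
        allowed_roles=frozenset({"report_body"}),
        forbidden_roles=frozenset({"shell_command", "recipient"}),
    ),
    payload={"title": "Example"},
)
```

Making the envelope immutable is a deliberate choice in this sketch: downstream steps can read the contract but cannot quietly widen what the result is allowed to do.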

The most important fields are the influence fields.

If a browser result came from an untrusted page, it may be allowed to influence the report body. It should not influence a shell command.

If a search result is stale, it may be allowed into a research note with a freshness warning. It should not update a customer record.

If an API response has a schema mismatch, it may be useful as a debugging signal. It should not become trusted workflow state.
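
These three rules can be sketched as a single gate. Everything named here is hypothetical: the ResultState class, the may_influence function, and the role names are placeholders for whatever a real workflow defines. The gate lets degraded results reach low-authority destinations with a warning and blocks them everywhere else.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional, Tuple


@dataclass(frozen=True)
class ResultState:
    trusted_source: bool
    schema_valid: bool
    expires_at: Optional[datetime]


# Hypothetical destinations where a degraded result is tolerable.
LOW_AUTHORITY_ROLES = {"report_body", "research_note", "debug_log"}


def may_influence(state: ResultState, role: str, now: datetime) -> Tuple[bool, str]:
    """Decide whether a result may flow into the given role, with a reason."""
    problems = []
    if not state.trusted_source:
        problems.append("untrusted source")
    if not state.schema_valid:
        problems.append("schema mismatch")
    if state.expires_at is not None and now >= state.expires_at:
        problems.append("stale")

    if not problems:
        return True, "ok"
    if role in LOW_AUTHORITY_ROLES:
        return True, "allowed with warning: " + ", ".join(problems)
    return False, "blocked: " + ", ".join(problems)
```

An untrusted browser result passes for report_body with a warning attached and is blocked for shell_command. A stale search result is blocked for customer_record.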

This is the difference between tool access and governed workflow design.

Tool access asks, "Can the agent call this?"

A tool result contract asks, "What is this result allowed to change?"

That second question is where a lot of agent safety actually lives.

Sources

Model Context Protocol, Tools specification: https://modelcontextprotocol.io/specification/2025-11-25/server/tools
Model Context Protocol, Schema reference: https://modelcontextprotocol.io/specification/2025-11-25/schema
ProvenanceGuard: Source-Aware Factuality Verification for MCP-Based LLM Agents: https://arxiv.org/abs/2606.18037
The Granularity Mismatch in Agent Security: https://arxiv.org/abs/2605.11039
Tool Receipts, Not Zero-Knowledge Proofs: https://arxiv.org/abs/2603.10060
ContractBench: Can LLM Agents Preserve Observation Contracts?: https://arxiv.org/abs/2605.17281
Beyond Function Calling: Benchmarking Tool-Using Agents under Tool-Environment Unreliability: https://arxiv.org/abs/2606.25819
Schema First Tool APIs for LLM Agents: https://arxiv.org/abs/2603.13404
