5 min | Agentic AI Security | June 21, 2026

Your AI Agent Needs a Capability Diff Before Tool Updates

Agent updates are not just code changes. Review tool lists, scopes, prompts, roots, dependencies, approvals, regression tests, and rollback paths before trusting updated agent workflows.

Ryan Macomber

Editor, VibeSec Advisory

An agent update is not just a code change. It can change what the agent can read, write, call, remember, retrieve, and approve.

That is why AI workflow reviews need a capability diff.

Most teams already know how to review a dependency update. They look at the package, version, changelog, known vulnerabilities, and test results.

Agent workflows need the same habit, but the diff has to cover authority.

What changed in the tool list?

What changed in the MCP server description?

What changed in the input schema, output schema, OAuth scope, filesystem root, service identity, system prompt, Skill file, memory rule, scheduler, or approval gate?

Those are not implementation details. They are the shape of the workflow's blast radius.

NIST's Generative AI Profile frames generative AI risk across the lifecycle and highlights governance, provenance, pre-deployment testing, and incident disclosure as core considerations. That maps cleanly to agent workflows: keep a record of the approved state, test before promotion, and preserve evidence when the system changes.

OWASP's Excessive Agency guidance is even more direct. The root causes it names are excessive functionality, excessive permissions, and excessive autonomy. A tool update can add all three without looking like a major launch.

MCP makes this practical. The specification says MCP servers expose tools, resources, and prompts, while clients may expose roots, sampling, and elicitation. Tool lists and root lists can change. Tool descriptions and annotations can influence model behavior. The spec also warns that descriptions of tool behavior, including annotations, should be treated as untrusted unless they come from a trusted server.
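
Because a description or schema can change while the tool name stays the same, one way to make such changes visible is to fingerprint each tool definition. The following is a minimal sketch, assuming tool definitions are available as plain dictionaries shaped like tool list entries (`name`, `description`, `inputSchema`). The helper names `fingerprint_tool` and `fingerprint_tools` and the sample data are hypothetical, not part of any SDK.

```python
import hashlib
import json


def fingerprint_tool(tool: dict) -> str:
    """Return a SHA-256 hex digest of one tool definition.

    Keys are sorted so the digest does not depend on key order.
    """
    canonical = json.dumps(tool, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def fingerprint_tools(tools: list[dict]) -> dict[str, str]:
    """Map each tool name to the digest of its full definition."""
    return {tool["name"]: fingerprint_tool(tool) for tool in tools}


# Hypothetical example data: same tool name, reworded description.
schema = {"type": "object", "properties": {"path": {"type": "string"}}}
approved = [{"name": "read_file",
             "description": "Read a file under the project root.",
             "inputSchema": schema}]
candidate = [{"name": "read_file",
              "description": "Read any file and send it to the audit service.",
              "inputSchema": schema}]

approved_prints = fingerprint_tools(approved)
# Flag every tool that is new or whose definition digest differs.
changed = [name for name, digest in fingerprint_tools(candidate).items()
           if approved_prints.get(name) != digest]
print(changed)
```

In this example the tool name is unchanged, but the reworded description produces a different digest, so `read_file` is flagged for review.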

So the review should be boring and specific.

Before an updated agent runs in production, ask:

  • Did any tool get added, removed, renamed, or changed?
  • Did any tool description, annotation, input schema, or output schema change?
  • Did any OAuth scope, token audience, service account, database role, or filesystem root change?
  • Did any prompt, Skill file, memory rule, approval rule, scheduler, or escalation path change?
  • Did any dependency, package, server, connector, action, or container version change?
  • Did the workflow gain a new read path?
  • Did it gain a new write path?
  • Did it gain a new external communication path?
  • Did regression tests cover the changed authority?
  • What evidence will the reviewer see?
  • What is the rollback path?
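
The first few questions in that list can be answered mechanically. Here is a minimal sketch, assuming each state is captured as a snapshot dictionary with the hypothetical keys `tools` (name to definition), `scopes`, and `roots`. Prompts, memory rules, schedulers, and approval gates would need their own entries, and the sample values are invented for illustration.

```python
def diff_capabilities(approved: dict, candidate: dict) -> dict:
    """Compare two capability snapshots and report what changed."""
    old_tools = approved.get("tools", {})
    new_tools = candidate.get("tools", {})
    result = {
        "tools": {
            "added": sorted(new_tools.keys() - old_tools.keys()),
            "removed": sorted(old_tools.keys() - new_tools.keys()),
            # Same name, different definition (description, schema, ...).
            "changed": sorted(
                name for name in old_tools.keys() & new_tools.keys()
                if old_tools[name] != new_tools[name]
            ),
        }
    }
    # Scopes and roots are treated as plain sets of strings.
    for key in ("scopes", "roots"):
        old = set(approved.get(key, []))
        new = set(candidate.get(key, []))
        result[key] = {"added": sorted(new - old), "removed": sorted(old - new)}
    return result


# Hypothetical snapshots of the last approved state and the update.
approved = {
    "tools": {"search_tickets": {"description": "Search support tickets."}},
    "scopes": ["tickets:read"],
    "roots": ["/srv/agent/workspace"],
}
candidate = {
    "tools": {
        "search_tickets": {"description": "Search support tickets."},
        "close_ticket": {"description": "Close a support ticket."},
    },
    "scopes": ["tickets:read", "tickets:write"],
    "roots": ["/srv/agent/workspace"],
}

diff = diff_capabilities(approved, candidate)
print(diff)
```

The update here adds one tool and one scope. Neither would necessarily stand out in a changelog, but both show up as new write paths in the diff.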

OpenAI's Agents SDK docs separate automatic guardrails from human review. Guardrails can validate input, output, and tool behavior. Human review pauses the workflow so a person can approve or reject side effects such as shell commands, edits, cancellations, and sensitive MCP actions.

That distinction matters. A guardrail can catch known bad patterns. A reviewer decides whether the new authority is acceptable.

The control pattern is simple:

  1. Freeze the last approved state.
  2. Generate the capability diff.
  3. Classify changes by authority: no new authority, read authority, write authority, or control authority.
  4. Run regression cases against the changed boundary.
  5. Require human approval for any change that raises authority.
  6. Store the diff, reviewer, decision, test result, and rollback path.
  7. Disable or revert the new capability when the diff contains a surprise.
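
Steps 3, 5, and 7 can be sketched as a small decision function. This is an illustration under stated assumptions: the `Authority` ordering, the `Change` record, the `expected` set, and the default rule that any new authority needs approval are all hypothetical choices, not a prescribed policy.

```python
from dataclasses import dataclass
from enum import IntEnum


class Authority(IntEnum):
    """The four authority classes from step 3, ordered by impact."""
    NONE = 0
    READ = 1
    WRITE = 2
    CONTROL = 3


@dataclass(frozen=True)
class Change:
    name: str             # e.g. "tool:close_ticket" (hypothetical label)
    authority: Authority  # authority the change grants


def review_decision(
    changes: list[Change],
    expected: set[str],
    approval_at: Authority = Authority.READ,  # assumed threshold
) -> tuple[str, list[str]]:
    """Return (decision, surprises) for a classified capability diff."""
    # Step 7: anything the update was not announced to change is a surprise.
    surprises = sorted(c.name for c in changes if c.name not in expected)
    if surprises:
        return "revert", surprises
    # Step 5: the highest authority in the diff decides whether a human signs off.
    highest = max((c.authority for c in changes), default=Authority.NONE)
    if highest >= approval_at:
        return "needs_approval", []
    return "promote", []


changes = [
    Change("tool:close_ticket", Authority.WRITE),
    Change("scope:tickets:write", Authority.WRITE),
]
print(review_decision(changes, expected={"tool:close_ticket", "scope:tickets:write"}))
print(review_decision(changes, expected={"tool:close_ticket"}))
```

The first call routes the update to a reviewer because it adds write authority. The second returns `revert` because the scope change was not among the expected changes.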

NIST SSDF recommends toolchain audit trails and monitoring tool logs for policy violations or anomalous behavior. SLSA frames provenance as verifiable information about where, when, and how an artifact was produced. GitHub's dependency review is useful because it shows dependency changes and security impact at pull request time.
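
One way to apply the audit-trail idea to step 6 is to store each review as one record that carries a digest of the diff. The sketch below is a minimal example. The field names, the `build_review_record` helper, and the sample values are assumptions for illustration, not a NIST or SLSA schema.

```python
import hashlib
import json


def build_review_record(diff: dict, reviewer: str, decision: str,
                        tests_passed: bool, rollback_ref: str,
                        reviewed_at: str) -> dict:
    """Bundle a capability review into one record with a digest of the diff."""
    # Canonical JSON so the same diff always yields the same digest.
    canonical = json.dumps(diff, sort_keys=True, separators=(",", ":"))
    return {
        "diff": diff,
        "diff_sha256": hashlib.sha256(canonical.encode("utf-8")).hexdigest(),
        "reviewer": reviewer,
        "decision": decision,
        "tests_passed": tests_passed,
        "rollback_ref": rollback_ref,
        "reviewed_at": reviewed_at,
    }


record = build_review_record(
    diff={"scopes": {"added": ["tickets:write"], "removed": []}},
    reviewer="reviewer@example.com",
    decision="approved",
    tests_passed=True,
    rollback_ref="agent-config@previous-approved",  # hypothetical reference
    reviewed_at="2026-06-21T12:00:00Z",
)

# One JSON object per line suits an append-only log file.
log_line = json.dumps(record, sort_keys=True)
print(log_line)
```

The digest lets a later reader check that the stored diff is the one the reviewer approved, and the rollback reference keeps the revert path next to the decision.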

Agent capability review should borrow that operating model.

Do not approve an agent update by reading the changelog alone.

Diff the authority.
