Agent Action Approval Matrix for AI Workflow Guardrails

A tool-connected AI agent should not get more access until the team knows exactly which actions it is allowed to take.

Short answer

An agent action approval matrix is a simple table that defines which AI agent actions are allowed, which need human review, and which are blocked. It should separate reading and drafting from sending, updating records, changing permissions, spending money, deleting data, publishing, or deploying. Build the matrix before adding tools, not after the first bad action.

Tool access changes the risk

A chat assistant that summarizes a document is one thing.

An AI agent that can open tools, read records, send messages, update systems, call APIs, or change permissions is different.

That is where a lot of AI governance gets too vague. The team says there is a human in the loop. The vendor says the workflow has guardrails. The policy says employees should use judgment.

None of that answers the operational question:

What is this agent actually allowed to do?

If nobody has written that down, the boundary usually gets decided by tool permissions, default settings, and whoever connected the integration. That is not governance. That is hoping the demo stays well behaved.

The matrix is not paperwork

An agent action approval matrix is a practical guardrail.

It lists the agent's possible actions and classifies each one:

Action status	Meaning
Allowed	The agent can do it inside the approved workflow and data boundary.
Review-required	The agent can prepare it, but a named human must approve before the action happens.
Blocked	The agent cannot do it unless the workflow is separately scoped and approved.

This sounds basic because it should be basic. Most teams do not need a 40-page AI policy before they can improve a workflow. They need to know whether the agent can only draft the customer email or whether it can send the customer email.

Those are not the same control.

Start with action classes

Do not start by asking whether the agent is safe. That question is too broad.

Start by asking what class of action the agent can take.

Action class	Default status	Example
Read	Allowed with data boundary	Search approved docs or approved records.
Draft	Allowed to draft; human review before use	Draft an email, ticket note, recap, or report.
Classify	Allowed with sampled review	Label tickets, leads, claims, or handoff status.
Recommend	Review-required	Suggest a next step, exception, or risk flag.
Prepare	Review-required	Prepare a CRM update, queue item, or change packet.
Send	Review-required by default	Send customer email, partner notes, support replies, or social posts.
Update	Review-required by default	Change CRM, billing, helpdesk, forecast, access, or knowledge base records.
Buy or approve	Blocked unless separately scoped	Approve discounts, terms, purchases, access, or policy exceptions.
Delete	Blocked	Delete files, tickets, records, accounts, contacts, or permissions.
Commit, publish, or deploy	Blocked unless separately scoped	Commit code, publish content, merge changes, deploy, or change production config.
Change permissions	Blocked unless separately scoped	Add users, change roles, grant scopes, or alter tool access.

That table will not cover every edge case. It does force the right conversation.

Reading is not sending. Drafting is not updating. Preparing a change is not applying a change. Recommending an exception is not approving an exception.

That separation is the point.
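
One way to make the defaults above concrete is a small lookup table. This is a minimal sketch, not a reference implementation: the class keys, the `Status` enum, and the `default_status` helper are hypothetical names, and the table's qualifiers ("with data boundary", "unless separately scoped", review of drafts) are deliberately not modeled. Treating an unknown action class as blocked is an assumption of this sketch, chosen so that new tools do not inherit access silently.

```python
from enum import Enum


class Status(Enum):
    ALLOWED = "allowed"
    REVIEW_REQUIRED = "review-required"
    BLOCKED = "blocked"


# Simplified defaults taken from the action class table.
# Qualifiers such as data boundaries or separate scoping are not modeled.
DEFAULT_STATUS = {
    "read": Status.ALLOWED,
    "draft": Status.ALLOWED,  # the agent may produce the draft; using it is a separate step
    "classify": Status.ALLOWED,
    "recommend": Status.REVIEW_REQUIRED,
    "prepare": Status.REVIEW_REQUIRED,
    "send": Status.REVIEW_REQUIRED,
    "update": Status.REVIEW_REQUIRED,
    "buy_or_approve": Status.BLOCKED,
    "delete": Status.BLOCKED,
    "commit_publish_deploy": Status.BLOCKED,
    "change_permissions": Status.BLOCKED,
}


def default_status(action_class: str) -> Status:
    """Look up the default status for an action class.

    Assumption of this sketch: anything not listed is blocked.
    """
    return DEFAULT_STATUS.get(action_class.strip().lower(), Status.BLOCKED)
```

The fail-closed default matters more than the exact keys. A class nobody has discussed yet should land in the blocked column until someone scopes it.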

Why this matters for prompt injection and excessive agency

OWASP's Top 10 for LLM Applications lists excessive agency (LLM06) as a major risk. The guidance covers LLM systems that can use tools, skills, plugins, or other system interfaces, and it ties the risk to three root causes: excessive functionality, excessive permissions, and excessive autonomy.

In plain English: an agent with too much access can do real damage when it gets confused, manipulated, or overconfident.

OWASP's prompt injection guidance (LLM01) makes the same problem sharper. Prompt injection, whether direct or indirect, can alter model behavior. Indirect injection matters because an agent may read untrusted content from websites, files, tickets, emails, docs, or records. If that content can influence a tool-connected agent, it may push the agent toward unauthorized function use, arbitrary actions in connected systems, or bad decisions.

Least privilege helps. Human approval helps. Neither one works well if the team has not defined which actions need approval.

That is what the matrix does.
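
As a rough illustration, the matrix can sit between the model and its tools, so the decision does not depend on what the model was told or what it read. The sketch below assumes a hypothetical setup: the tool names, the `MATRIX` dictionary, the in-memory `review_queue`, and the `gate` function are all invented for the example, and a real system would persist the queue and record who approved what.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical matrix keyed by tool name.
MATRIX: Dict[str, str] = {
    "search_docs": "allowed",
    "send_email": "review-required",
    "delete_record": "blocked",
}

# Proposed actions waiting for a named human reviewer.
review_queue: List[Tuple[str, dict]] = []


def gate(tool_name: str, args: dict, tools: Dict[str, Callable[..., object]]) -> str:
    """Decide what happens to a tool call the agent has requested."""
    # Tools missing from the matrix are treated as blocked (sketch assumption).
    status = MATRIX.get(tool_name, "blocked")
    if status == "allowed":
        tools[tool_name](**args)
        return "executed"
    if status == "review-required":
        # Nothing runs yet; the call is only recorded for a reviewer.
        review_queue.append((tool_name, args))
        return "queued for review"
    return "refused"
```

Because the gate runs outside the model, injected text can change what the agent asks for, but not which requests run without a human.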

A simple approval matrix example

Here is a practical version for a small business AI workflow.

Agent action	Data class	Default status	Reviewer	Evidence required
Summarize approved internal notes	Internal	Allowed	Workflow owner confirms source list	Source labels
Draft a customer follow-up email	Customer	Review-required	Account owner	Draft, source note, approval
Update a CRM next-step field	Customer and revenue	Review-required	CRM owner or account owner	Proposed change and rollback note
Send a support reply	Customer	Review-required	Support owner	Final reply and approval record
Approve a discount	Financial	Blocked	Finance or executive owner	Separate approval path
Grant tool access	Security	Blocked	IT or security owner	Access request and audit log
Delete a customer record	Customer and legal	Blocked	Legal, privacy, and system owner	Deletion request and recovery plan
Deploy production config	Security and operations	Blocked	Engineering owner	Change plan and rollback path

The exact rows will change by workflow. A recruiting workflow, sales workflow, support workflow, finance workflow, and engineering workflow should not share the same default action boundary.

But the pattern should be the same.

Name the action. Name the data class. Name the reviewer. Name the evidence. Decide whether the agent is allowed to act, prepare, or stop.
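
If the matrix lives in a spreadsheet or config file, a small check can catch rows that skip one of those names. This is a sketch under assumptions: `MatrixRow` and `row_problems` are hypothetical, and the field names simply mirror the columns in the example table.

```python
from dataclasses import dataclass
from typing import List

VALID_STATUSES = {"allowed", "review-required", "blocked"}


@dataclass(frozen=True)
class MatrixRow:
    action: str
    data_class: str
    status: str
    reviewer: str
    evidence: str


def row_problems(row: MatrixRow) -> List[str]:
    """Return a list of problems with a row; an empty list means it is complete."""
    problems = []
    # Every row must name the action, data class, reviewer, and evidence.
    for name in ("action", "data_class", "reviewer", "evidence"):
        if not getattr(row, name).strip():
            problems.append(f"missing {name}")
    if row.status not in VALID_STATUSES:
        problems.append(f"unknown status: {row.status!r}")
    return problems
```

For example, a row with status "maybe" and an empty reviewer comes back with two problems, which is a useful prompt for the conversation the team has not had yet.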

The review has to happen before the side effect

A human approval gate is only useful if it lets the person stop the risky action before it happens.

A log entry after the agent sends the wrong customer email is evidence. It is not approval.

A weekly review of accidental permission changes is monitoring. It is not a gate.

A manager clicking approve without seeing the source, proposed action, and rollback path is theater.

For review-required actions, the reviewer should see:

The proposed action.
The source records or source labels.
The data class involved.
The destination system.
The reason the agent recommends the action.
The rollback or correction path.
The decision options: approve, revise, block, or escalate.

If the workflow cannot show those items, the agent should stay in draft mode.
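
The checklist above can be enforced with a completeness check on whatever the reviewer is shown. The sketch below is illustrative only: the packet is assumed to be a plain dictionary, and the field names, `missing_fields`, and `resolve` are hypothetical.

```python
from typing import List

# One key per item the reviewer should see (names are illustrative).
REQUIRED_FIELDS = (
    "proposed_action",
    "sources",
    "data_class",
    "destination_system",
    "reason",
    "rollback_path",
)
DECISIONS = ("approve", "revise", "block", "escalate")


def missing_fields(packet: dict) -> List[str]:
    """List required fields that are absent or empty in the review packet."""
    return [name for name in REQUIRED_FIELDS if not packet.get(name)]


def resolve(packet: dict, decision: str) -> str:
    """Map a reviewer decision to an outcome, before any side effect happens."""
    if missing_fields(packet):
        # An incomplete packet cannot be approved, whatever the reviewer clicks.
        return "stay in draft mode"
    if decision not in DECISIONS:
        raise ValueError(f"unknown decision: {decision!r}")
    return "execute" if decision == "approve" else "hold"
```

The completeness check runs before the decision is even read, so an approval click on a packet with no rollback path does not release the action.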

Where this fits in FORGE

This is a FORGE Guardrails problem.

Baseline identifies the workflow, connected tools, source systems, owners, data classes, and business metric.

Skills document how the team reviews drafts, recommendations, exceptions, and logs.

Agents define what automation can read, prepare, and attempt.

Guardrails classify each action as allowed, review-required, or blocked.

Schedule keeps the matrix current after tool changes, role changes, model changes, vendor changes, data migrations, and workflow changes.

Capture records approvals, blocked actions, exceptions, misses, and fixes.

That turns agent governance from a policy statement into a workflow habit.

What not to delegate by default

For most small and mid-market teams, these should be blocked unless separately scoped:

Changing permissions or access.
Deleting records or files.
Spending money.
Approving discounts, terms, refunds, exceptions, or contracts.
Sending sensitive customer, employee, legal, financial, security, or compliance messages.
Updating systems of record without a proposed change and rollback path.
Publishing public content without final human review.
Deploying or changing production systems.
Making rights-affecting, regulated, legal, medical, employment, credit, insurance, safety, or disciplinary decisions.

This is not anti-automation. It is how automation gets permission to survive contact with the business.

A practical next step

Pick one agent or planned automation. Write down ten actions it could take.

Then mark each action as allowed, review-required, or blocked.

If the team cannot agree, do not add more tools yet. Map the workflow first. Clarify the data boundary. Name the reviewer. Decide what evidence must be kept.

If you want a simple starting point, the free workflow examples from VibeSec Advisory help map the workflow, identify data boundaries, and decide where approval gates belong before automation gets more access.

Pair the matrix with trace review so reviewers can compare approved actions against the agent's real tool calls, arguments, guardrail events, and final action.
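
A minimal sketch of that comparison, assuming a hypothetical trace format in which each approval and each recorded tool call is a dictionary with a `tool` name and an `args` mapping. Real agent frameworks log traces differently, so the shape here is an assumption, and the trace is assumed to be filtered down to review-required tools before the check runs.

```python
import json
from collections import Counter
from typing import Dict, List


def _key(call: Dict) -> str:
    # Canonical JSON so that argument order does not affect matching.
    return json.dumps({"tool": call["tool"], "args": call["args"]}, sort_keys=True)


def unapproved_calls(approved: List[Dict], trace: List[Dict]) -> List[Dict]:
    """Return traced calls that have no matching approval.

    Each approval covers exactly one call with the same tool and arguments.
    """
    budget = Counter(_key(a) for a in approved)
    flagged = []
    for call in trace:
        k = _key(call)
        if budget[k] > 0:
            budget[k] -= 1  # consume one approval
        else:
            flagged.append(call)
    return flagged
```

Matching on arguments as well as tool name is the design choice worth keeping: an approved email to one customer should not cover a second send, or the same tool pointed at a different record.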

Sources

OWASP: LLM06 Excessive Agency
OWASP: LLM01 Prompt Injection
NIST: AI RMF Core
NIST: AI RMF Govern Playbook
OWASP: LLM Applications Cybersecurity and Governance Checklist
