Agent governance workflow library

Workflow failure label review

Label AI workflow failures, near misses, and eval misses before teams update Skills, tool contracts, memory, approval gates, or release decisions.

This is a complete workflow library with 5 individual skills. Download the full library or pick the specific skill folder your team needs first.

Individual skills in this library

Use one skill at a time, or keep the full workflow together.

Some AI tools expect one skill folder per upload. Download the full library when you want the whole workflow, or download an individual skill when you only need one job done.

Skill 1

Failure label intake reviewer

Use when a failed or risky AI workflow run needs a structured label record before the team updates a Skill, eval, memory item, tool contract, approval gate, or release decision.

Skill 2

Symptom and root-cause splitter

Use when a failed output, trace, or review note contains visible symptoms but the team needs to decide whether the likely fix belongs in the prompt, Skill, memory, tool, parser, source data, approval step, or workflow state.

Skill 3

Invisible failure finder

Use when a workflow appears to finish, pass, or satisfy the user but reviewers need to check whether the path included hidden mismatch, unsafe control decisions, missing confirmation, unobserved customer harm, or silent user walkaway.

Skill 4

Checkpoint route mapper

Use when a failure label needs to become the next checkpoint in the workflow: clarify, ask, confirm, stop, refuse, recover, human review, Skill update, tool contract update, memory review, parser change, or rollback.
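
As a rough illustration of the mapping described above, the checkpoint names can be held in an enum and looked up by label. This is a minimal sketch, not the skill's actual implementation: the label names in `ROUTES`, the `next_checkpoint` helper, and the fallback to human review are all assumptions made for the example.

```python
from enum import Enum


class Checkpoint(Enum):
    """Checkpoint names taken from the skill description."""
    CLARIFY = "clarify"
    ASK = "ask"
    CONFIRM = "confirm"
    STOP = "stop"
    REFUSE = "refuse"
    RECOVER = "recover"
    HUMAN_REVIEW = "human review"
    SKILL_UPDATE = "Skill update"
    TOOL_CONTRACT_UPDATE = "tool contract update"
    MEMORY_REVIEW = "memory review"
    PARSER_CHANGE = "parser change"
    ROLLBACK = "rollback"


# Hypothetical failure labels; a real team would define its own.
ROUTES = {
    "missing_confirmation": Checkpoint.CONFIRM,
    "ambiguous_request": Checkpoint.CLARIFY,
    "malformed_tool_output": Checkpoint.PARSER_CHANGE,
}


def next_checkpoint(label: str) -> Checkpoint:
    """Return the mapped checkpoint, or human review for unmapped labels (an assumed default)."""
    return ROUTES.get(label, Checkpoint.HUMAN_REVIEW)


print(next_checkpoint("missing_confirmation").value)  # confirm
print(next_checkpoint("unknown_label").value)         # human review
```

Sending unmapped labels to human review is one conservative choice; a team could equally raise an error so that every label gets an explicit route.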

Skill 5

Eval case converter

Use when a recurring failure label should become an eval scenario with expected safe behavior, blocked behavior, evidence boundary, approval route, and critical failure conditions.
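
The eval scenario described above could be captured in a small record, one field per element named in the description. This is a sketch under assumptions: the `EvalCase` class, its field names, the `missing_fields` helper, and the sample values are hypothetical and do not come from the library itself.

```python
from dataclasses import dataclass, field


@dataclass
class EvalCase:
    """One eval scenario derived from a recurring failure label (hypothetical schema)."""
    failure_label: str
    scenario: str
    expected_safe_behavior: str
    blocked_behavior: str
    evidence_boundary: str
    approval_route: str
    critical_failure_conditions: list[str] = field(default_factory=list)

    def missing_fields(self) -> list[str]:
        """Return the names of fields left empty, so reviewers can see gaps."""
        return [name for name, value in vars(self).items() if not value]


# Illustrative values only.
case = EvalCase(
    failure_label="missing_confirmation",
    scenario="Agent sends a pricing email without asking the owner to confirm.",
    expected_safe_behavior="Pause and ask the workflow owner to confirm before sending.",
    blocked_behavior="Sending the email without confirmation.",
    evidence_boundary="Only the approved pricing sheet may be cited.",
    approval_route="",
)
print(case.missing_fields())  # ['approval_route', 'critical_failure_conditions']
```

Checking for empty fields before a case joins the eval suite keeps half-specified scenarios from passing as complete.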

Security fit check

Is the public Workflow failure label review library enough, or does this need deeper review?

Use the public library when the workflow is low-risk, the inputs are already sanitized, and a team member can review the output before it reaches a buyer or customer.

Do a deeper review when this workflow touches real tools, data sources, role ownership, approval paths, or customer-facing output.

AI workflow evals, AI Operations, Security, Platform Engineering, Workflow Owner, RevOps

Good deeper-review trigger signals

The workflow touches customer, prospect, CRM, proposal, security, pricing, or campaign data.
Different teams disagree on the approved source of truth.
The AI output could become customer-facing, revenue-impacting, or compliance-sensitive.
You need reusable eval checks before asking more people to use the workflow.
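
The fit check above can be read as a simple decision rule. The sketch below is one possible encoding, assuming a hypothetical `WorkflowProfile` record and `needs_deeper_review` helper; the field names paraphrase the criteria in this section and are not part of the library.

```python
from dataclasses import dataclass


@dataclass
class WorkflowProfile:
    """Hypothetical answers to the fit-check questions for one workflow."""
    low_risk: bool
    inputs_sanitized: bool
    human_reviews_output: bool
    touches_real_tools_or_data: bool = False
    customer_facing_output: bool = False
    disputed_source_of_truth: bool = False


def needs_deeper_review(profile: WorkflowProfile) -> bool:
    """True if any trigger signal is present or a public-library condition is unmet."""
    triggers = (
        profile.touches_real_tools_or_data
        or profile.customer_facing_output
        or profile.disputed_source_of_truth
    )
    public_library_ok = (
        profile.low_risk
        and profile.inputs_sanitized
        and profile.human_reviews_output
    )
    return triggers or not public_library_ok


print(needs_deeper_review(WorkflowProfile(True, True, True)))  # False
print(needs_deeper_review(
    WorkflowProfile(True, True, True, customer_facing_output=True)
))  # True
```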
