April 16, 2026
WitnessOps

How to Scope a Governed AI Engagement

A structured method for determining whether an AI system problem is reviewable, what inputs are needed, and what a scoped engagement should produce.

This framework is for determining whether a specific AI system problem is reviewable before committing to a review. It extracts the claim under examination, maps the trust boundary around it, and identifies whether the available evidence is sufficient to produce a verifiable finding — or whether the problem needs to be restated first.

When This Framework Applies

What Makes a Problem Reviewable

A problem is reviewable when it has a defined system boundary, a specific claim, an evidence artifact (or a clear absence of one), and a question that is answerable with current information. Remove any of those four elements and the problem may be real, but it is not yet reviewable.

Defined system boundary. The review has to be about something specific — a model, an agent pipeline, a governance layer, a decision surface. "AI governance in the organization" is not a boundary. "The approval workflow for the procurement agent in the vendor's deployed stack" is.

A specific claim. The system, the vendor, or the documentation must be asserting something. The claim can be affirmative ("all agent actions are logged") or structural ("the governance layer is independent of execution"). If there is no claim, there is nothing to evaluate against.

An evidence artifact or its documented absence. Evidence does not have to exist for a problem to be reviewable — a well-documented absence of evidence is itself a finding. What cannot be reviewed is a situation where neither evidence nor its absence has been established.

An answerable question. The question has to be resolvable with information that exists or can be requested. "Is this AI system safe?" is not answerable. "Does the signed receipt record include the authorization token at execution time?" is.

| | Reviewable | Not Reviewable |
|---|---|---|
| Boundary | Specific agent pipeline in production | "The AI layer" |
| Claim | "All actions are logged with a signed receipt" | "The system is governed" |
| Evidence | Receipts exist but have not been verified | No artifact, no absence documented |
| Question | Do the receipts contain the fields required to reconstruct the action? | Is the vendor trustworthy? |

Reviewable example: The system documentation states that all agent actions produce a signed execution receipt. A sample of receipts has been provided. The reviewable problem is whether the receipts contain sufficient fields to reconstruct the authorization chain for each action.

Not reviewable (as stated): "We want to understand whether the AI is making fair decisions." This is a real concern but not a reviewable problem until a specific decision surface is identified, the fairness claim is stated explicitly, and there is an artifact (model card, output log, decision record) to evaluate against.
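The four reviewability elements above can be sketched as a simple checklist. This is an illustrative data structure, not a prescribed schema — the `Problem` record and its field names are hypothetical:

```python
from dataclasses import dataclass

@dataclass
class Problem:
    """Hypothetical record of a candidate review problem (field names are illustrative)."""
    boundary: str = ""        # e.g. "procurement agent pipeline in production"
    claim: str = ""           # e.g. "all agent actions are logged"
    evidence_state: str = ""  # "artifact", "documented_absence", or "" (unestablished)
    question: str = ""        # the specific question the review will answer

    def missing_elements(self) -> list[str]:
        """Return which of the four reviewability elements are absent."""
        missing = []
        if not self.boundary:
            missing.append("defined system boundary")
        if not self.claim:
            missing.append("specific claim")
        if self.evidence_state not in ("artifact", "documented_absence"):
            missing.append("evidence artifact or documented absence")
        if not self.question:
            missing.append("answerable question")
        return missing

    def is_reviewable(self) -> bool:
        return not self.missing_elements()

# A well-documented absence of evidence still yields a reviewable problem:
p = Problem(
    boundary="procurement agent pipeline in production",
    claim="all agent actions produce a signed execution receipt",
    evidence_state="documented_absence",
    question="do the receipts contain the fields needed to reconstruct authorization?",
)
print(p.is_reviewable())  # → True
```

Note that `evidence_state` treats a documented absence as satisfying the evidence element, matching the rule that what cannot be reviewed is an unestablished evidence state, not a missing artifact.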

Step 1: State the claim under review

Extract the actual claim before doing anything else. Claims are often embedded in documentation, architecture diagrams, contract schedules, or compliance certifications. Surface them explicitly.

Claims about outcome are the hardest to verify directly. If the claim is outcome-based, identify the structural or behavioral claims it depends on and review those instead.

Step 2: Map the trust boundary

Identify what the system controls, what it delegates, and what it assumes. Every trust gap in this map is a potential gap between the claim and the evidence.

| Question | What you are looking for |
|---|---|
| What does the system control end-to-end? | The owned execution surface |
| What does it delegate to another system or provider? | Third-party models, APIs, cloud infrastructure |
| What does it assume about the delegated components? | Implicit trust, contractual guarantees, or nothing |
| What would an external party need to verify the claim? | The minimum artifact set for independent review |

See: Authority Is Not Execution — the distinction between a system that holds authority and the system that executes under it is where most governance claims break down.

See: Delegation Is Not Disappearance — delegating execution to a third-party model or tool does not remove accountability from the delegating system.

Flag every component in the boundary map where the system asserts governance but delegates execution. Those are the high-priority review points.
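The flagging rule above can be expressed directly over a boundary map. The component names and record shape here are illustrative, assuming the map has been reduced to per-component governance and delegation flags:

```python
# Minimal sketch of a trust-boundary map (component names are hypothetical).
# Each entry records whether the system asserts governance over a component
# and whether execution of that component is delegated to a third party.
boundary_map = [
    {"component": "policy engine",      "asserts_governance": True,  "delegated": False},
    {"component": "LLM completion API", "asserts_governance": True,  "delegated": True},
    {"component": "vector store",       "asserts_governance": False, "delegated": True},
]

def high_priority_review_points(components):
    """Components where governance is asserted but execution is delegated."""
    return [c["component"] for c in components
            if c["asserts_governance"] and c["delegated"]]

print(high_priority_review_points(boundary_map))  # → ['LLM completion API']
```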

Step 3: Identify the evidence artifacts

For each claim identified in Step 1, determine what evidence exists and what that evidence can actually prove.

| Artifact type | What it can prove | What it cannot prove |
|---|---|---|
| Signed execution receipt | That a specific action was recorded at a point in time with a specific authorization state | That the action was authorized correctly, or that the receipt schema captures all relevant fields |
| Audit log | That events were recorded by the logging system | That the log is complete, tamper-free, or that it captures the full authorization chain |
| Architecture diagram | That a governance component exists in the design | That the component is implemented, active, or applied to the actions in question |
| Compliance report | That a compliance process was completed | That the system behaves as described, or that the review covered the specific claim under examination |

For each artifact, record what it can prove, what it cannot prove, and whether it has been independently verified.

If no artifact exists, document that absence explicitly. "No signed receipts exist for the actions in scope" is a finding, not a gap in the review.
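A field-sufficiency check over sampled receipts can be sketched as follows. The required field names are hypothetical — the actual schema comes from the system under review — and an empty sample is reported as a documented absence rather than an error:

```python
# Sketch: verify that each sampled receipt carries the fields needed to
# reconstruct the authorization chain. REQUIRED_FIELDS is an assumption,
# not a real schema.
REQUIRED_FIELDS = {"action_id", "timestamp", "actor", "authorization_token", "signature"}

def receipt_findings(receipts):
    """Return one finding per deficient receipt, or a documented-absence finding."""
    if not receipts:
        # Absence is itself a finding, not a gap in the review.
        return ["No signed receipts exist for the actions in scope."]
    findings = []
    for r in receipts:
        missing = REQUIRED_FIELDS - r.keys()
        if missing:
            findings.append(f"receipt {r.get('action_id', '?')}: missing {sorted(missing)}")
    return findings or ["All sampled receipts contain the required fields."]

print(receipt_findings([]))
print(receipt_findings([{"action_id": "a1", "timestamp": "2026-04-16T09:00:00Z",
                         "actor": "agent", "authorization_token": "tok",
                         "signature": "sig"}]))
```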

Step 4: Scope the gap

The reviewable problem is the gap between what is claimed and what is verifiable. State it precisely. A well-scoped gap statement has three parts: the claim, the evidence state, and the specific question the review will answer.

Structure:

The system claims [X]. The evidence base is [Y]. The reviewable gap is: [Z].

Worked example:

The system claims all agent actions are governed. The governance layer is documented. No signed execution receipts exist. The reviewable gap is: the claim exceeds the evidence — governance is asserted structurally but there is no artifact that records whether the governance layer was applied at execution time.

Well-scoped gap: "The vendor's architecture diagram shows a policy enforcement layer between the agent and the external API. No execution receipts are available. The gap is: there is no artifact that confirms the policy layer was active during the actions in scope."

Poorly-scoped gap: "We don't know if the AI is really governed." This is not a gap statement — it is a concern. Convert it by identifying the specific claim, the available evidence, and the precise question.

A poorly-scoped gap produces analysis. A well-scoped gap produces a finding.
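The three-part structure from this step can be rendered as a simple template, which forces a concern to be converted into a claim, an evidence state, and a question before the review begins (the function name is illustrative):

```python
def gap_statement(claim: str, evidence: str, gap: str) -> str:
    """Render the three-part gap statement: claim, evidence state, reviewable gap."""
    return (f"The system claims {claim}. The evidence base is {evidence}. "
            f"The reviewable gap is: {gap}.")

print(gap_statement(
    "all agent actions are governed",
    "a documented governance layer and no signed execution receipts",
    "governance is asserted structurally, but no artifact records whether "
    "the governance layer was applied at execution time",
))
```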

What a Scoped Engagement Produces

A properly scoped engagement produces a specific set of outputs. Not a report in the abstract — these items:

If an engagement cannot produce all seven of these items, the scope is not complete.

The Principle

A review that cannot be checked is not a review — it is an opinion with formatting. The value of a scoped engagement is that it separates what is verified from what is assumed, and it states both explicitly. An engagement that moves accountability produces a finding that a second reviewer could evaluate independently. An engagement that adds paperwork produces a document that terminates scrutiny instead of enabling it.


See also: How to Review a System for Trust Boundaries — the broader trust-boundary review framework this scoping method feeds into.