SOP Playbook Builder // Pro Edition // Angie Bailey

The Structural Failure

The Two Problems Nobody Talks About

Most SOP guides tell you to "write down what you do, step by step." That advice is technically correct and practically useless. It fails because of two problems that compound each other.

Problem 1: Semantic Compression. When you translate a rich, contextual, multi-sensory process into linear written instructions, you lose information. A lot of it. Think about it this way. You're a chef who's been making a particular dish for ten years. Someone asks you to write the recipe. You write "sear the steak until browned." But what you actually do involves: assessing the marbling to decide how hot the pan needs to be, listening to the sound of the sizzle to know if the pan was hot enough, watching the color change at the edge of the meat, pressing the surface to gauge doneness by resistance and making a split-second call about when the Maillard reaction has gone far enough based on smell. "Sear until browned" compresses all of that into three words. The gap between what you write and what you actually do is the compression gap, and it's where SOPs go to die.

Problem 2: Proprioceptive Labor. Proprioception is your body's awareness of itself in space. Proprioceptive labor is the cognitive equivalent: the invisible work your brain does automatically because you've done this process so many times. It includes pattern recognition you can't articulate ("this lead feels off"), environmental scanning you don't notice you're doing ("I always check Slack before starting this"), quality assessments that happen below conscious awareness ("something about this draft isn't right") and branching decisions you make so fast they feel like instinct rather than logic. You can't write down what you can't see yourself doing.

Before You Start

Scope the Process

Before capturing anything, answer five questions. Write the answers down... they become the header of your SOP.

01

What is the goal?

State the outcome in one sentence. Not what you do, but what's true when you're done. "The client receives a qualified lead report with a fit/no-fit recommendation and supporting evidence" is a goal. "Process incoming leads" is not.

02

What triggers this process?

What event, signal or condition causes you to start? Time-based ("every Monday morning"), event-based ("when a new lead enters the CRM") or request-based ("when a client asks for a deliverable"). "When needed" is not a trigger.

03

What inputs do you need before you can start?

Every piece of information, access, file or resource you need in hand before step one. If you frequently start and then realize you're missing something, that thing is an input you forgot to list.

04

What does done look like?

Not just the format (a document, an email, a dashboard update) but the qualities it must have. "A 2-page brief with three strategic recommendations, each supported by at least one data point, written in the client's preferred terminology" is a done state.

05

Who (or what) will execute this?

A human with domain expertise needs less scaffolding than a human without it. An AI agent needs the most explicit instructions of all but also benefits from examples and constraints more than narrative explanation. Design for the least-context executor you'll use.
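If you want the five answers in a form a tool can check, one option is a small record. This is a minimal sketch, not part of the playbook: the class name `SopHeader`, its field names and the `problems` check are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class SopHeader:
    """Answers to the five scoping questions. Field names are illustrative."""
    goal: str          # what is true when the process is done
    trigger: str       # the event, signal or condition that starts it
    inputs: list[str]  # everything needed in hand before step one
    done_state: str    # format and qualities of the finished output
    executor: str      # the least-context executor you will use

    def problems(self) -> list[str]:
        """List scoping answers that are empty or too vague to act on."""
        issues = []
        if not self.goal.strip():
            issues.append("goal is empty")
        # The playbook rejects "when needed" as a trigger.
        if self.trigger.strip().lower() in {"", "when needed"}:
            issues.append("trigger is missing or too vague")
        if not self.inputs:
            issues.append("no inputs listed")
        if not self.done_state.strip():
            issues.append("done state is empty")
        if not self.executor.strip():
            issues.append("executor is empty")
        return issues


header = SopHeader(
    goal="The client receives a qualified lead report with a fit/no-fit recommendation",
    trigger="when needed",
    inputs=["CRM access"],
    done_state="A 2-page brief with three strategic recommendations",
    executor="AI agent",
)
print(header.problems())  # ['trigger is missing or too vague']
```

The check only catches empty or obviously vague answers. Whether a goal really describes an outcome rather than an activity still needs a human read.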

The System

The 4-Layer Capture Stack

You will run through your process four times, each time with a different lens. Each layer captures something the others miss. Skipping layers produces SOPs with invisible holes.

Layer 1

Conscious Narration

What you're doing: Writing out the process as you'd naturally describe it to a competent peer.

Sit down and write out every step from trigger to done state. Don't overthink it. Write the way you'd explain it to someone smart who's never done this specific task.

The key discipline here is granularity. If you think your process has 3 steps, it probably has 8-12. A useful test: if a step takes you more than 5 minutes to do, it's probably multiple steps compressed into one. Break it apart.

For each step, write:

What you do (the instruction)
What you're trying to achieve (the purpose)
What the output of this step looks like (the deliverable)

Don't worry about being perfect. This is Layer 1... it's supposed to have gaps. The next three layers exist specifically to find and fill those gaps.

What this captures: The conscious, articulable portion of your process... typically 40-60% of what you actually do.

Layer 2

Failure Forensics

What you're doing: For each step, describing what wrong looks like instead of what right looks like.

Why this works: Experts struggle to articulate their quality standards in the positive ("what makes this good") but can easily recognize and describe failure ("this is wrong because..."). Failure patterns are more salient, more specific and more actionable than success descriptions.

Go through each step from Layer 1 and answer:

What does a bad version of this step's output look like? Describe 2-3 specific failure modes.
What's the most common mistake someone new makes here?
What would make you redo this step if you saw the output?
Is there a failure here subtle enough that someone might miss it, and that would cause problems downstream?

Write these as "If you see X, that means Y went wrong" statements. These become your success criteria (inverted) and your quality gates.

Example:

Step: "Draft the executive summary"

Layer 1 instruction: "Summarize the key findings in 3-4 paragraphs for a senior audience"

Layer 2 failure modes:

"If the summary is longer than half a page, you included too much detail... it should be findings and implications only, not methodology"

"If every paragraph starts with 'The data shows,' you're writing a report, not a summary... lead with the business implication, not the data"

"If someone who hasn't read the full report can't make a decision from the summary alone, it's missing the 'so what'"

What this captures: Implicit quality standards, edge cases and the difference between "technically correct" and "actually good."

Layer 3

Self-Shadowing

What you're doing: Actually performing the process in real time while narrating every micro-decision, hesitation and environmental check you make.

This is the most important layer. It's also the most uncomfortable, because it forces you to watch yourself think.

Set up a way to capture your narration while you work:

Voice memo on your phone (transcribe later)
Screen recording with audio narration
A colleague watching you work and writing down what they see you do that isn't in your Layer 1 writeup
A running notes document open next to your work

Then do the process for real. Not hypothetically. Not from memory. Actually do it, start to finish, while narrating:

"I'm checking X before I start because..."
"I'm pausing here because something doesn't look right... specifically..."
"I just skipped ahead to check Y before continuing because..."
"I almost did X but then changed my mind because..."
"I'm going back to adjust the thing from step 2 because now that I see step 4's output..."

The Delta Protocol: After self-shadowing, compare your Layer 3 narration against your Layer 1 writeup. Every action, check or decision that appears in Layer 3 but not Layer 1 is part of the proprioceptive layer... the invisible work you do automatically.

This delta is the most valuable part of your SOP. Common things that surface:

Pre-checks you run before starting a step (environmental scanning)
Quality micro-assessments between steps (is this good enough to proceed?)
Conditional branches you take without thinking
Tool switches or workarounds you've developed
Reference materials you consult that aren't in your instructions
Backtracking patterns (going back to revise earlier work based on later discoveries)
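The Delta Protocol is, at its core, a set difference: what shows up in Layer 3 that Layer 1 never mentioned. Here is a rough sketch, assuming you have already reduced both layers to short action labels. The function name `delta` and the matching rule (case and whitespace ignored, wording otherwise identical) are assumptions. Real narration rarely matches this neatly, so treat the result as a starting list to review by hand.

```python
def delta(layer1_items: list[str], layer3_items: list[str]) -> list[str]:
    """Return Layer 3 items that never appear in Layer 1, in original order."""

    def normalize(text: str) -> str:
        # Ignore case and extra whitespace when comparing labels.
        return " ".join(text.lower().split())

    known = {normalize(item) for item in layer1_items}
    return [item for item in layer3_items if normalize(item) not in known]


layer1 = ["Open the CRM record", "Draft the summary"]
layer3 = [
    "Check Slack for client updates",
    "open the CRM record",
    "Draft the summary",
    "Re-read the step 2 output",
]
print(delta(layer1, layer3))
# ['Check Slack for client updates', 'Re-read the step 2 output']
```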

What this captures: The 40-60% of your process that lives below conscious awareness... the proprioceptive labor.

Layer 4

Decision Tree Extraction

What you're doing: Taking every "I just know" moment from Layer 3 and forcing it into explicit if/then logic.

Why this matters: Tacit knowledge isn't magic. It's pattern recognition that's been repeated so many times it's become automatic. The patterns ARE logical... you've just stopped seeing the logic. Decision tree extraction makes that logic visible, testable and teachable.

Take each proprioceptive item from Layer 3's delta and ask:

What am I actually looking at when I make this judgment? (The input signal)
What are the possible states of that signal? (The conditions)
What do I do differently based on each state? (The branches)
What happens if I get it wrong? (The stakes... this helps calibrate how much precision the rule needs)

Structure each one as:

WHEN [situation/signal]
IF [condition A] → THEN [action A]
IF [condition B] → THEN [action B]
ELSE → [default action]
STAKES: [what goes wrong if this decision is wrong]
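The WHEN / IF / ELSE / STAKES structure can also be held as data, which makes each rule testable. This is a minimal sketch: the class names `Branch` and `DecisionRule`, the lead-budget example and its thresholds are all hypothetical and exist only to show the shape.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Branch:
    condition: Callable[[dict], bool]  # the IF test applied to the input signal
    action: str                        # the THEN action


@dataclass
class DecisionRule:
    when: str               # WHEN: the situation or signal
    branches: list[Branch]  # IF/THEN pairs, checked in order
    default_action: str     # ELSE
    stakes: str             # STAKES: what goes wrong if this decision is wrong

    def decide(self, signal: dict) -> str:
        """Return the action of the first branch whose condition holds."""
        for branch in self.branches:
            if branch.condition(signal):
                return branch.action
        return self.default_action


# Hypothetical rule; the budget thresholds are invented for illustration.
rule = DecisionRule(
    when="A new lead enters the CRM",
    branches=[
        Branch(lambda s: s.get("budget", 0) >= 10000, "Book a discovery call"),
        Branch(lambda s: s.get("budget", 0) > 0, "Send the self-serve guide"),
    ],
    default_action="Ask the lead for budget details",
    stakes="A good-fit lead gets a generic reply and goes cold",
)

print(rule.decide({"budget": 15000}))  # Book a discovery call
print(rule.decide({}))                 # Ask the lead for budget details
```

Branches are checked in order, so put the most specific condition first.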

Not every judgment call will reduce cleanly to rules. That's fine. For the ones that resist, use Compression Markers (next section) to flag them honestly rather than pretending they don't exist.

What this captures: The decision logic that experts have internalized to the point of invisibility.

Honest Incompleteness

Compression Markers

Semantic compression is inevitable. The goal isn't to eliminate it... it's to compress deliberately, marking where fidelity was lost so future users (or future you) know where the SOP is approximating rather than specifying.

[COMPRESSED]

"I simplified this section intentionally. Here's what I left out: ___." Use when you've deliberately reduced complexity for readability or because the full version would be too long. Include a note about what was compressed and where to find the full version if needed.

[TACIT]

"There's knowledge here I can't fully articulate yet." Use when Layer 3 surfaced something you do but Layer 4 couldn't reduce it to rules. This is an honest acknowledgment that the SOP is incomplete at this point. It tells the executor: "You may need to ask the expert for guidance here until this gets resolved."

[CONTEXT-DEPENDENT]

"This step changes based on conditions I haven't fully mapped." Use when you know the step varies but haven't catalogued all the variations. Include the variations you have identified and a note that others exist.

[JUDGMENT CALL]

"This requires human judgment that I haven't reduced to rules." Use when the decision genuinely requires expertise, taste or political awareness that can't be codified. This marker explicitly tells AI executors: "stop and escalate to a human here."

These markers aren't failures. They're the most honest parts of your SOP. An SOP with no compression markers is almost certainly lying about its completeness.
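Because the markers are plain bracketed tags, a few lines of code can count them in a finished SOP. A rough sketch, assuming the SOP is available as plain text and the tags are written exactly as above. The names `MARKERS` and `find_markers` are illustrative.

```python
import re

MARKERS = ("COMPRESSED", "TACIT", "CONTEXT-DEPENDENT", "JUDGMENT CALL")

# Matches any of the four tags wrapped in square brackets, e.g. "[TACIT]".
MARKER_PATTERN = re.compile(
    r"\[(" + "|".join(re.escape(m) for m in MARKERS) + r")\]"
)


def find_markers(sop_text: str) -> dict[str, int]:
    """Count how often each compression marker appears in the SOP text."""
    counts = {marker: 0 for marker in MARKERS}
    for match in MARKER_PATTERN.finditer(sop_text):
        counts[match.group(1)] += 1
    return counts


sample = (
    "Step 3: Assess tone of the draft. [TACIT]\n"
    "Step 5: Decide whether to escalate. [JUDGMENT CALL]\n"
    "Step 6: Adjust pricing language. [TACIT]\n"
)
print(find_markers(sample))
# {'COMPRESSED': 0, 'TACIT': 2, 'CONTEXT-DEPENDENT': 0, 'JUDGMENT CALL': 1}
```

A count of zero across all four is worth a second look, for the reason given above.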

Assembly

Structuring the SOP

Once you've run all four layers, you have raw material. Structure it into the companion template. Each step includes fields for:

Step Name ... A verb-first action label. "Qualify the lead" not "Lead qualification."
Instructions ... Your Layer 1 narration, refined. Write for the least-context executor you'll use.
Failure Modes ... Your Layer 2 output. 2-4 specific ways this step goes wrong.
Proprioceptive Notes ... Your Layer 3 delta items for this step. The things you do that you wouldn't have written down.
Decision Logic ... Your Layer 4 if/then structures for this step.
Success Criteria ... Inverted from your failure modes. "The step is done correctly when [none of the failure modes are present] AND [these positive conditions are met]."
Example ... One good output and one bad output. The contrast is more instructive than either alone.
Response Format ... What the output of this step should look like structurally. A template, a schema, a sample.
Compression Markers ... Any [COMPRESSED], [TACIT], [CONTEXT-DEPENDENT] or [JUDGMENT CALL] flags for this step, with notes.
Configuration Notes ... Tool links, automation references, environment setup. The "where and how" rather than the "what."
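If you keep the SOP as structured data rather than a document, the ten fields above map onto a simple record. This is a sketch only. The companion template itself is not reproduced here, so the class name `SopStep`, the field types and the `missing_fields` helper are assumptions.

```python
from dataclasses import dataclass, field, fields


@dataclass
class SopStep:
    """One step of the SOP, one attribute per template field."""
    step_name: str                                                  # verb-first label
    instructions: str                                               # Layer 1, refined
    failure_modes: list[str] = field(default_factory=list)          # Layer 2
    proprioceptive_notes: list[str] = field(default_factory=list)   # Layer 3 delta
    decision_logic: list[str] = field(default_factory=list)         # Layer 4
    success_criteria: list[str] = field(default_factory=list)
    example: str = ""               # one good and one bad output
    response_format: str = ""
    compression_markers: list[str] = field(default_factory=list)
    configuration_notes: str = ""

    def missing_fields(self) -> list[str]:
        """Names of fields that are still empty, in template order."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]


step = SopStep(
    step_name="Qualify the lead",
    instructions="Review the CRM record and decide fit or no-fit.",
    failure_modes=["Recommendation given without supporting evidence"],
)
print(step.missing_fields())
```

An empty field is not automatically a problem (a step may have no decision logic), but listing the blanks shows which layers a step has not been through yet.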

Validation

Testing and Validation

An SOP you haven't tested is a hypothesis, not a procedure.

Test 1

Cold Run

Give the SOP to someone who has never done this process. Watch them attempt it without helping. Every place they get stuck, ask a question or do something wrong is a gap in your documentation. Fix the SOP, don't coach the person.

Test 2

Expert Review

Give the SOP to another expert (if one exists). Ask them: "What's missing? What would you do differently? Where would this fail?" Their disagreements with your approach are either errors to fix or legitimate variations to document.

Test 3

Edge Case Sweep

Think of the three weirdest, most unusual instances of this process you've ever encountered. Walk through the SOP with each one. If the SOP doesn't handle them, decide: add a branch, or add a [CONTEXT-DEPENDENT] marker with a note about what to do.

Test 4

AI Dry Run

If this SOP will be used by AI, convert each step into a prompt and run it. The AI's mistakes are diagnostic... they tell you exactly where your instructions are ambiguous, incomplete or assume knowledge that isn't in the SOP.

Deployment

Making It Executable

Once the SOP is stable, you can deploy it in multiple ways. The core principle: own the SOP, choose the executor. The process definition is the asset. The execution layer is interchangeable.

For Human Executors

The SOP as-is should be sufficient. Hand it off with the expectation that they'll flag gaps and you'll refine together.

For AI Agents (Prompt-Based)

Each step becomes a prompt. The step's Instructions become the main directive, Success Criteria become the evaluation rubric, Examples become few-shot demonstrations and Response Format becomes the output schema. [JUDGMENT CALL] markers become "stop and ask the user" instructions.
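The mapping above can be sketched as a small function that assembles one step's fields into prompt text. The function name `build_step_prompt`, the section labels and the wording of the stop instruction are assumptions. Adapt them to whatever model or tool you use.

```python
def build_step_prompt(
    instructions: str,
    success_criteria: list[str],
    examples: list[str],
    response_format: str,
    has_judgment_call: bool = False,
) -> str:
    """Assemble one SOP step into a single prompt string."""
    sections = [
        "INSTRUCTIONS:\n" + instructions,
        # Success Criteria act as the evaluation rubric.
        "SUCCESS CRITERIA:\n" + "\n".join(f"- {c}" for c in success_criteria),
        # Examples act as few-shot demonstrations.
        "EXAMPLES:\n" + "\n\n".join(examples),
        # Response Format acts as the output schema.
        "RESPONSE FORMAT:\n" + response_format,
    ]
    if has_judgment_call:
        sections.append(
            "STOP: this step contains a [JUDGMENT CALL]. "
            "Ask the user before proceeding."
        )
    return "\n\n".join(sections)


prompt = build_step_prompt(
    instructions="Summarize the key findings in 3-4 paragraphs for a senior audience.",
    success_criteria=["No longer than half a page", "Leads with the business implication"],
    examples=["GOOD: ...", "BAD: ..."],
    response_format="Plain text, 3-4 paragraphs.",
    has_judgment_call=True,
)
print(prompt)
```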

For Automation Tools

Each step becomes a module or action in your tool of choice (Zapier, Make, n8n, etc.). Decision Logic maps directly to conditional branches. Inputs and outputs map to data flows between modules.

For Agent Frameworks

The entire SOP can become a system prompt or skill definition. Compression Markers help you decide which steps to give the agent autonomy on and which to gate with human review.
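One way to turn markers into a gating decision is a simple lookup. The policy below is an assumption, not a rule from this playbook: it sends steps tagged [JUDGMENT CALL] or [TACIT] to human review and lets everything else run autonomously. The names `REVIEW_MARKERS` and `execution_mode` and the sample steps are hypothetical.

```python
# Assumed policy: these markers force a human checkpoint. Adjust to taste.
REVIEW_MARKERS = {"JUDGMENT CALL", "TACIT"}


def execution_mode(step_markers: list[str]) -> str:
    """Return 'human_review' if any gating marker is present, else 'autonomous'."""
    if REVIEW_MARKERS.intersection(step_markers):
        return "human_review"
    return "autonomous"


# Hypothetical steps mapped to the markers recorded for each.
steps = {
    "Pull the lead record": [],
    "Summarize the account history": ["COMPRESSED"],
    "Decide whether to escalate": ["JUDGMENT CALL"],
}
for name, markers in steps.items():
    print(f"{name}: {execution_mode(markers)}")
# Pull the lead record: autonomous
# Summarize the account history: autonomous
# Decide whether to escalate: human_review
```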

Reference

Quick Reference

01 Scope ... Answer the five scoping questions (goal, trigger, inputs, done state, executor)

02 Layer 1: Conscious Narration ... Write out what you think you do

03 Layer 2: Failure Forensics ... Describe what wrong looks like for each step

04 Layer 3: Self-Shadowing ... Do the process while narrating every micro-decision

05 Delta Protocol ... Compare Layer 3 against Layer 1; the gap is the proprioceptive layer

06 Layer 4: Decision Tree Extraction ... Force every "I just know" into if/then logic

07 Structure ... Fill in the companion template with outputs from all four layers

08 Mark Compression ... Add [COMPRESSED], [TACIT], [CONTEXT-DEPENDENT], [JUDGMENT CALL] markers

09 Test ... Cold run, expert review, edge case sweep, AI dry run

10 Deploy and Iterate ... Ship it, use it, improve it, repeat
