I noticed something early in my AI ops build-out that bothered me.

The AI could pull any data I asked for. It could summarize. It could reformat. But when I asked it to diagnose a performance issue — “CPA jumped from $37 to $59 last month, why?” — the output was almost always junior-level.

Surface observations. First-layer metrics. Pull the cost. Pull the conversions. Note the ratio. Observation: CPA went up. Recommendation: maybe try different bids. Done.

That’s not analysis. That’s data restated. Any junior PPC analyst could run those queries. What a senior strategist does is different, and I couldn’t quite articulate why — until I started paying attention to my own process.

What separates senior analysis from generic AI output isn’t the data. It’s the thinking that happens before the data gets pulled.

That realization is what the investigation-methodology skill exists to encode. It’s not a query. It’s not a tool. It’s the actual framework I run through in my head when I open an underperforming account — hypothesis generation, layered evidence gathering, probability updates, explicit scope for the unknown — turned into something an AI can follow.

It’s open-sourced as part of my PPC AI Skills repo, and it’s the skill every one of my investigation agents loads before they’re allowed to run a single query.

Here’s what it does, why the framework matters more than any individual tool, and why this is the piece most PPC teams will never put into their AI stack — which is exactly why it sets the output apart.

Get the Investigation Methodology skill → github.com/fourteenwm/ppc-ai-skills/investigation-methodology

Free and open-sourced. Drop the SKILL.md into any Claude Code project in under a minute. No configuration required.

The Core Problem: AI Without Framework Is Just a Query Engine

The default behavior of any capable AI, when asked to investigate a performance issue, is the same: pull the obvious metrics, summarize what they show, suggest something plausible.

That’s not investigation. That’s observation.

Senior analysis involves three things AI doesn’t do naturally.

First, hypothesis generation before data gathering. A senior strategist lists 5-8 possible causes before pulling any data. This prevents confirmation bias — the “I think it’s a bid strategy issue” mental shortcut that makes you only look at bid strategy data.

Second, layered evidence gathering. A senior doesn’t dump all the data at once. They pull Layer 1, see what it suggests, then pull Layer 2 to narrow the hypothesis, and so on. Each layer focuses the next.

Third, explicit probability updates. A senior revises their working theory as evidence comes in. “Hypothesis 2 just got more likely. Hypothesis 4 just got eliminated.” This keeps the investigation honest.

An AI without this framework does none of these things. It pulls everything, summarizes, and guesses. The output looks fluent, because AI is fluent. But the reasoning underneath is shallow.

The fix isn’t a better AI. The fix is to give the AI the framework senior humans use, and require it to follow the process.

Step 1: Define the Problem Statement, Not the Symptom

Most performance complaints come in fuzzy.

“The account is down.” What does that mean? Cost? Conversions? CPA? Over what period? Compared to what baseline?

The first rule of the methodology is to refuse to start until the problem statement is specific.

Fuzzy: “Account is performing poorly.”

Specific: “CPA increased from $37.97 in December to $59.04 in January across the Search campaigns.”

If the user hands the AI a fuzzy statement, the skill tells the AI to ask clarifying questions before doing anything else. What metric. What baseline. What time period. Which campaigns.

This feels like overhead. It isn’t. A precise problem statement narrows the hypothesis space before the first query runs. The alternative is an investigation that wanders, generates mediocre hypotheses, and reaches an unsatisfying conclusion.
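The refusal rule can be sketched as a simple completeness check: until every dimension of the problem statement is filled in, the only valid output is a clarifying question. This is a hypothetical illustration, not the skill's actual implementation; the field names and questions are mine.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical schema for a problem statement. Field names are illustrative.
@dataclass
class ProblemStatement:
    metric: Optional[str] = None    # e.g. "CPA"
    baseline: Optional[str] = None  # e.g. "$37.97 in December"
    current: Optional[str] = None   # e.g. "$59.04 in January"
    scope: Optional[str] = None     # e.g. "Search campaigns"

def missing_details(stmt: ProblemStatement) -> list:
    """Clarifying questions still needed before any query is allowed to run."""
    questions = {
        "metric": "Which metric changed?",
        "baseline": "What is the baseline period and value?",
        "current": "What is the current period and value?",
        "scope": "Which campaigns are affected?",
    }
    return [q for field, q in questions.items() if getattr(stmt, field) is None]

# A fuzzy complaint like "the account is down" leaves most questions open.
fuzzy = ProblemStatement(metric="CPA")
print(missing_details(fuzzy))
```

The point isn't the code; it's that "specific enough to investigate" is a checkable condition, not a vibe.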

Step 2: Generate Hypotheses Before Looking at Data

This is the single biggest behavioral change the skill enforces.

Before pulling any data, the AI must list 5-8 possible causes, grouped by category.

Internal changes (things the account owner controls): Budget adjustments, targeting changes, keyword edits, bid strategy shifts, ad copy updates, landing page changes, new negatives.

External factors (things the account owner doesn’t control): Seasonality, competitor entry, market shifts, Google algorithm updates.

Measurement issues (the data itself is suspect): Conversion tracking changes, attribution model changes, data lag, apples-to-oranges campaign comparisons.

Each hypothesis gets a rough probability. “Budget was reduced — 30% likely. New competitor in auction — 20% likely.” These don’t have to be precise. They just have to exist.

Why this matters: without written hypotheses, the AI investigates the first idea that sounds plausible and ignores everything else. Written hypotheses force breadth. Breadth prevents the “I was sure it was the bid strategy” trap.
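As a sketch, the written hypothesis list is just structured data: cause, category, rough prior. The causes and numbers below mirror the examples in this section; the format is mine, not the skill's.

```python
# Illustrative hypothesis list for the CPA jump: 5-8 candidate causes,
# grouped by category, each with a rough prior. The numbers are gut feel,
# not calibrated estimates; they just have to exist.
hypotheses = [
    {"cause": "Budget was reduced",         "category": "internal",    "p": 0.30},
    {"cause": "Bid strategy shifted",       "category": "internal",    "p": 0.15},
    {"cause": "Landing page changed",       "category": "internal",    "p": 0.10},
    {"cause": "New competitor in auction",  "category": "external",    "p": 0.20},
    {"cause": "Seasonal demand drop",       "category": "external",    "p": 0.15},
    {"cause": "Conversion tracking change", "category": "measurement", "p": 0.10},
]

# Breadth check: all three categories represented before any data is pulled.
assert {h["category"] for h in hypotheses} == {"internal", "external", "measurement"}
```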

Step 3: Pull Data in Layers, Not All at Once

Senior investigators don’t dump data. They pull one layer, update their theory, pull the next.

Layer 1: Performance metrics. Cost, conversions, CPA by period. Apples-to-apples comparison. Update hypotheses.

Layer 2: Traffic quality. Impressions, clicks, conversion rate. Is traffic down, or is conversion down? Different root causes. Update hypotheses.

Layer 3: Segmentation. Device, geo, campaign, day-of-week. Where is the drop concentrated? Update hypotheses.

Layer 4: Change history. What changed in the account? When? Update hypotheses.

Layer 5: Tracking and attribution. Is the measurement itself reliable? Final check.

By Layer 4 or 5, the investigation usually has a clear diagnosis. If it doesn’t, the framework says so explicitly and moves to “inconclusive with remaining possibilities.”

The discipline here is resisting the urge to pull Layer 3 data before seeing Layer 1 results. Most AI investigations fail because they pull everything, get overwhelmed, and pattern-match on whichever number looks most unusual. Layered pulls force focus.
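The loop itself is simple; the discipline is in the structure. A minimal sketch, with placeholder pull and update functions standing in for real queries and judgment calls (everything here is illustrative, not the skill's actual mechanics):

```python
LAYERS = [
    "performance metrics",   # Layer 1: cost, conversions, CPA by period
    "traffic quality",       # Layer 2: impressions, clicks, conversion rate
    "segmentation",          # Layer 3: device, geo, campaign, day-of-week
    "change history",        # Layer 4: what changed in the account, and when
    "tracking/attribution",  # Layer 5: is the measurement itself reliable?
]

def investigate(pull_layer, update_hypotheses, hypotheses, threshold=0.7):
    """Pull one layer at a time, update the theory, stop once one cause dominates."""
    for layer in LAYERS:
        evidence = pull_layer(layer)
        hypotheses = update_hypotheses(hypotheses, evidence)
        best = max(hypotheses, key=hypotheses.get)
        if hypotheses[best] >= threshold:
            return {"diagnosis": best, "after_layer": layer}
    return {"diagnosis": "inconclusive", "remaining": hypotheses}

# Stub demo: each layer's evidence keeps pointing at a budget cut,
# so the loop converges early instead of pulling all five layers.
def pull(layer):
    return f"data for {layer}"  # placeholder for a real query

def update(h, _evidence):
    return {k: (min(1.0, v + 0.2) if k == "budget cut" else v) for k, v in h.items()}

print(investigate(pull, update, {"budget cut": 0.3, "seasonality": 0.2}))
```

Note what the structure forbids: there is no path through the loop that pulls Layer 3 before Layer 1's result has updated the hypotheses.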

Step 4: Update Probabilities as Evidence Comes In

After every layer, the framework requires an explicit probability update.

Layer 2 finding: Conversion rate dropped 30% on mobile. Desktop held steady.

Hypothesis updates:
✅ Landing page or ad relevance issue — 40% → 65%
❌ Budget issue — 30% → 5% (traffic volume is normal)
⚠️  Seasonal demand — 15% → 10%

This step is what keeps the investigation from wandering. Every layer produces a decision about which hypotheses are more likely and which are out. By Layer 4, the probability distribution usually shows one dominant cause.

Without this update step, the AI treats every layer as independent data and ends the investigation with “here’s a lot of information, good luck.” The probability scorecard forces convergence.
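One way to sketch the scorecard is a Bayesian-style update: each finding multiplies every hypothesis by how well it explains the evidence, then the distribution is renormalized. The multipliers below are numbers I picked to roughly reproduce the movement in the Layer 2 example above; they're illustrative, not calibrated.

```python
def update(priors, likelihoods):
    """Posterior proportional to prior times likelihood, renormalized to sum to 1."""
    unnorm = {h: p * likelihoods.get(h, 1.0) for h, p in priors.items()}
    total = sum(unnorm.values())
    return {h: v / total for h, v in unnorm.items()}

priors = {"landing page / ad relevance": 0.40, "budget": 0.30,
          "seasonality": 0.15, "other": 0.15}

# Layer 2 finding: mobile conversion rate down 30%, traffic volume normal.
# Multipliers >1 support a hypothesis, <1 count against it (illustrative values).
evidence = {"landing page / ad relevance": 1.67, "budget": 0.2, "seasonality": 0.7}

posterior = update(priors, evidence)
for h, p in posterior.items():
    print(f"{h}: {priors[h]:.0%} -> {p:.0%}")
```

The budget hypothesis collapses because normal traffic volume is strong evidence against it, which is exactly the ❌ line in the scorecard.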

Step 5: Be Willing to Say “Inconclusive”

The hardest rule, and the one that separates honest investigation from AI confabulation.

If the data doesn’t point to a clear root cause after Layer 4, the skill requires the AI to say so:

“The data is inconclusive. Here’s what I’ve eliminated: [list]. Here are the remaining possibilities that need manual investigation: [list].”

This is the rule most AI systems violate. They’d rather invent a plausible-sounding root cause than admit the data didn’t support one. That’s what produces the “it’s probably seasonality” default answer when the data says nothing of the kind.

A senior strategist is comfortable with “I don’t know yet.” The skill encodes that comfort.
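As a sketch, the rule is a final decision gate: name a root cause only if one hypothesis clears a confidence bar, otherwise report what was eliminated and what remains. The thresholds here are ones I made up for illustration.

```python
def conclude(hypotheses, confident=0.70, eliminated=0.05):
    """Name a root cause only if one hypothesis clears the bar; otherwise say so."""
    best, p = max(hypotheses.items(), key=lambda kv: kv[1])
    if p >= confident:
        return f"Root cause: {best} ({p:.0%} confidence)."
    ruled_out = [h for h, v in hypotheses.items() if v <= eliminated]
    remaining = [h for h, v in hypotheses.items() if v > eliminated]
    return (f"Inconclusive. Eliminated: {ruled_out}. "
            f"Remaining possibilities for manual investigation: {remaining}.")

# Nothing dominant after the final layer: report honestly instead of guessing.
print(conclude({"seasonality": 0.35, "tracking change": 0.40, "budget": 0.02}))
```

The key design choice is that "inconclusive" is a first-class return value, not a failure state the system is allowed to paper over.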

What This Has Actually Prevented

Since I started loading investigation-methodology into every agent that runs diagnostics:

  • Jumping to conclusions. Cases where my old workflow would have stopped at Layer 1 and recommended the wrong fix. The framework kept the investigation open until Layer 3 or 4 revealed the actual cause.
  • Confirmation bias. I’ve caught myself starting an investigation convinced I knew the cause, only to have the hypothesis generation step surface a possibility I hadn’t considered — which turned out to be right.
  • Data dumps. The layered approach prevents the “pull everything, summarize everything” output that overwhelms rather than informs.
  • Manufactured root causes. Letting the AI say “inconclusive” instead of inventing a plausible story has caught issues where the real cause was measurement-level, not performance-level.

None of these are flashy catches. They’re the steady difference between analysis that diagnoses the actual problem and analysis that gives you a confident-sounding guess.

Get the Investigation Methodology Skill

Install in 30 seconds

→ View the skill on GitHub

Copy the SKILL.md file into your Claude Code project:

```shell
mkdir -p .claude/skills/investigation-methodology
curl -o .claude/skills/investigation-methodology/SKILL.md \
  https://raw.githubusercontent.com/fourteenwm/ppc-ai-skills/main/investigation-methodology/SKILL.md
```

Claude Code auto-loads the skill when any diagnostic or investigation task begins. No configuration required. Works with any AI harness that respects skill files — I built it for Claude Code but the framework is portable.

Free. Open-sourced. MIT licensed.

The full repo has dozens of other PPC AI skills I use in production every day — GAQL query patterns, mutation safety, SQR classification, impression share diagnostics, and more. All at github.com/fourteenwm/ppc-ai-skills.

The Bigger Point

Most of the AI-in-PPC conversation is about tools. Which skill. Which agent. Which model. Tools are the easy part. Every PPC manager using AI has roughly the same toolkit available.

The differentiator isn’t the tool. It’s the framework the tool operates inside.

Generic AI plus no framework gives you junior-level analysis. Generic AI plus a senior strategist’s framework gives you senior-level analysis. Same model. Same queries. Wildly different output.

That’s why I think investigation-methodology is the most underrated skill in my repo. It’s not a query builder or a campaign creator. It’s the part that makes every other skill produce work worth reading.

Give the AI the data. Give it the tools. Then give it the thinking.

Not in that order.