How Anna Shows Her Work
You found the insight. Revenue from paid search is up 34% this quarter. You're about to put it in the deck for your VP.
Then the thought hits: but is it actually up? Or is this just noise?
This is the gap that keeps smart people from trusting their own data. Not the finding — the proof. You can see the number. You just can't defend it.
Observations vs. evidence
Most AI tools stop at observations. "Revenue appears to be higher in Q3." "Channel B seems to outperform Channel A." Appears. Seems. Weasel words dressed up as analysis.
An observation tells you what the data looks like. Evidence tells you whether you should bet on it.
The difference is statistical testing. When Anna finds a pattern, she doesn't just report it. She tests it. And she tells you the result in plain language — not in p-values.
A concrete example
Say you're a marketing manager comparing two acquisition channels — paid search and paid social — over the last six months. You upload the CSV. You ask Anna: "Which channel is actually performing better?"
Here's what Anna comes back with:
The observation: Paid search generated an average of $14,200/month in attributed revenue. Paid social generated $11,800/month. That's a $2,400 gap.
The evidence: Anna runs Welch's t-test on the weekly revenue figures (about 26 weeks per channel). The result: t = 3.18, p = 0.003, Cohen's d = 0.82 (large effect). She chose Welch's t-test over the standard version because the two channels' variances weren't equal. She translates it:
"Paid search outperforms paid social by $2,400/month on average. The difference is statistically significant: if the two channels actually performed the same, you'd see a gap this large only about 0.3% of the time. The effect size is large, meaning this isn't a marginal difference. It's a real gap."
That's the difference between "it looks like search is better" and "search is better, here's the proof, and here's how confident I am."
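If you want to see the mechanics, here is roughly what that kind of test looks like in Python with SciPy. The revenue figures below are synthetic placeholders, so the output won't reproduce Anna's exact numbers; the point is the shape of the test, not the values.

```python
import numpy as np
from scipy import stats

# Synthetic placeholder data: 26 weeks of attributed revenue per channel,
# scaled so the means land near $14,200 and $11,800 per month.
rng = np.random.default_rng(42)
paid_search = rng.normal(loc=14200 / 4.33, scale=900, size=26)
paid_social = rng.normal(loc=11800 / 4.33, scale=900, size=26)

# Welch's t-test: unlike the standard two-sample t-test, it doesn't
# assume the two groups have equal variance.
t_stat, p_value = stats.ttest_ind(paid_search, paid_social, equal_var=False)

# Cohen's d with a pooled standard deviation, as a rough effect-size measure.
pooled_sd = np.sqrt((paid_search.var(ddof=1) + paid_social.var(ddof=1)) / 2)
cohens_d = (paid_search.mean() - paid_social.mean()) / pooled_sd

monthly_gap = (paid_search.mean() - paid_social.mean()) * 4.33
print(f"gap ≈ ${monthly_gap:,.0f}/month")
print(f"t = {t_stat:.2f}, p = {p_value:.4f}, Cohen's d = {cohens_d:.2f}")
```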
When Anna reports a finding, look for three things: the direction (up or down), the significance (p-value in plain English), and the effect size (how much it matters practically). All three together make a finding you can present without caveats.
Why this matters in the meeting
Your VP doesn't care about p-values. They care about making the right call.
But there's a difference between presenting "paid search revenue is higher" and presenting "paid search outperforms paid social by $2,400/month — statistically significant, large effect size, based on six months of data." The first invites challenge. The second invites a decision.
Anna gives you the backing. The statistical test is right there. The methodology is right there. The sample size is right there.
What happens when the data says "not sure"
This is the part most AI tools skip entirely.
Sometimes the difference between two groups isn't significant. Sometimes the sample is too small to draw a conclusion. Sometimes the trend is real but the effect size is trivial — statistically significant but practically meaningless.
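That last case is easy to demonstrate. With enough traffic, even a trivial lift clears the p < 0.05 bar; the effect size is what tells you it doesn't matter. A quick sketch with made-up conversion counts (not Anna's output), using statsmodels:

```python
from statsmodels.stats.proportion import proportions_ztest, proportion_effectsize

# Made-up counts: 4.2% vs 4.0% conversion, 100,000 visitors per variant.
conversions = [4200, 4000]
visitors = [100_000, 100_000]

z_stat, p_value = proportions_ztest(conversions, visitors)
effect = proportion_effectsize(0.042, 0.040)   # Cohen's h

print(f"p = {p_value:.3f}")         # comfortably under 0.05
print(f"Cohen's h = {effect:.3f}")  # tiny effect: significant, but trivial
```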
Anna tells you that too. She'll say something like: "There's a 2.1% difference in conversion rate between the two landing pages, but it's not statistically significant (p = 0.23). You'd need about 4 more weeks of data at current traffic to detect a meaningful difference."
That's not a non-answer. That's a genuinely useful answer. It means: don't make a decision yet. Keep running the test. Come back when you have more data.
Knowing when you can't conclude something is just as valuable as knowing when you can. It prevents the premature call — pulling the budget from a channel that might actually be working, or shipping a landing page change based on two weeks of inconclusive data.
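The "how much more data" estimate comes from a power calculation. Here's a minimal sketch of that kind of calculation with statsmodels. The baseline conversion rate, the lift worth detecting, and the traffic figure are all assumptions made up for illustration, not values Anna reported.

```python
from statsmodels.stats.power import NormalIndPower
from statsmodels.stats.proportion import proportion_effectsize

# Assumed inputs, purely for illustration: a 3.0% baseline conversion rate,
# a 2.1-percentage-point lift worth detecting, ~180 visitors per page per week.
baseline_rate = 0.030
target_rate = 0.051
weekly_visitors_per_page = 180

effect = proportion_effectsize(target_rate, baseline_rate)

# Visitors needed per page for 80% power at the usual 5% significance level.
needed_per_page = NormalIndPower().solve_power(
    effect_size=effect, alpha=0.05, power=0.80, alternative="two-sided"
)

print(f"≈ {needed_per_page:.0f} visitors per page")
print(f"≈ {needed_per_page / weekly_visitors_per_page:.1f} weeks at current traffic")
```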
The confidence chain
Every finding Anna produces follows the same structure:
- What she found — the pattern, trend, or difference, stated plainly
- How she tested it — the statistical method, chosen based on your data's characteristics
- How confident she is — significance level and effect size, translated to English
- What it means for your decision — the practical implication, not just the statistical one
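If you wanted to capture that chain in code (say, to log findings or feed them into a report template), it might look something like this minimal sketch. The field names are illustrative, not Anna's actual output format.

```python
from dataclasses import dataclass

@dataclass
class Finding:
    """One finding carrying its own evidence (illustrative fields, not Anna's schema)."""
    summary: str        # what she found, stated plainly
    test: str           # how she tested it
    p_value: float      # significance, translated to plain English when presented
    effect_size: float  # how much it matters practically
    implication: str    # what it means for the decision

finding = Finding(
    summary="Paid search outperforms paid social by $2,400/month on average.",
    test="Welch's t-test on weekly attributed revenue",
    p_value=0.003,
    effect_size=0.82,
    implication="The gap is real and large; budget decisions can lean on it.",
)
```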
This chain is what turns an AI output into something you can present. Not because it looks impressive — because it's defensible. Your VP can push back on the number, and you can point to the test. Your stakeholder can ask "are you sure?" and you can say yes, and explain why.
The real fear
Nobody says this out loud, but the fear is simple: What if I present AI-generated numbers and they're wrong?
Fair. That's a career risk, not a data risk.
The answer isn't to avoid using AI for analysis. It's to use AI that shows its work. Anna doesn't ask you to trust her. She shows you the evidence and lets you decide. The statistical test is right there. The confidence interval is right there. The sample size is right there.
You're not presenting AI-generated numbers. You're presenting statistically tested findings that an AI helped you produce. There's a meaningful difference.
Your numbers are only as strong as the evidence behind them. Anna makes sure the evidence is there.