Two Versions of the Same Campaign: Which One Actually Worked?
Most campaign post-mortems compare CPM and CTR and call it done. Anna joins engagement metrics with comment sentiment so the creative director can defend the work with the comments, not just the clicks.
By Anna · ~8 min read · Updated May 16, 2026
The campaign post-mortem deck is due Wednesday. The agency has two creative variants of the same launch in market — Variant A is the product-led cut, all hero shots and copy beats; Variant B is the narrative cut, a 40-second story about why the product exists. Both ran across the same platforms with the same budget and audience targeting. The CFO wants to know which one to keep running. The creative director wants to know which one to defend in the next round of pitches.
The standard answer is a screenshot of the ad platform dashboard, two columns of CTR and CPM, and a paragraph that says "Variant A outperformed on click-through, but we want to keep monitoring." Nobody is happy with this answer. The CFO does not see a decision. The creative director does not see the work. Neither person has any sense of why one variant did what it did.
The reason the standard answer is unsatisfying is that it is reading half the data. The metrics tell you what happened. The comments tell you why.
Why a CTR-versus-CPM comparison is not a post-mortem
The platform dashboards are built to optimise spend, not to evaluate creative. They surface CTR, CPM, ROAS, and a handful of completion metrics. These are all reasonable numbers. They are also profoundly incomplete for the question a creative team needs to answer, which is: did this variant build the brand we want to build?
Three blind spots stand out.
First, top-line engagement metrics flatten audience differences. A high CTR on Variant A might come from a different segment than a high completion rate on Variant B. The aggregate hides which audience preferred which variant — and the audience-by-variant split is where the next campaign brief lives.
Second, engagement volume and sentiment move independently. A variant can drive a lot of comments and most of them can be negative, or it can drive fewer comments and all of them can be high-intent. The platform dashboard cannot tell the difference. A 0.74% comment rate that is 90% skepticism is worse than a 0.32% comment rate that is 90% purchase intent, and both are invisible from the campaign view.
Third, completion rate and CTR often disagree. The platform reports both but rarely connects them. Variant B can have a higher completion rate (people watched it through) and a lower CTR (they did not click after watching). That is a different story than "Variant B underperformed" — it is "Variant B was watched, processed, and absorbed without an immediate click." Whether that is good or bad depends on whether you trust the creative to do long-arc work.
The honest post-mortem treats the comments as data, not as colour. Anna treats them that way by default.
What Anna does
Anna pulls both variants across every platform they ran on — Meta, TikTok, YouTube Shorts, and wherever else the campaign appeared. She joins the engagement metrics from the ad platforms with the comment text from the organic placements (and the paid placements that allowed comments). For each comment, she runs two =AI() columns: one classifies sentiment (Positive, Neutral, Negative), one classifies the comment theme — Aesthetic, Product question, Tagging a friend, Skepticism, Purchase intent, Off-topic.
The classification persists in the user's dataset. The creative director can see the prompt Anna used, edit the theme list (some campaigns want a "humour" theme, some want a "comparison-to-competitor" theme), and re-run the classification with the updated schema. This is what makes the analysis defensible — every theme label is auditable.
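To make the mechanics concrete, here is a minimal Python sketch of the same two-column classification. The `classify` keyword heuristic is a stand-in for the model call behind Anna's =AI() columns (which are not exposed as a Python API), and the cue lists and sample rows are illustrative only:

```python
import pandas as pd

# Stand-in for the LLM call behind the =AI() columns; a real run would prompt
# a model with the label set. Keyword cues just keep the sketch runnable.
def classify(text: str, cues: dict[str, list[str]], default: str) -> str:
    lowered = text.lower()
    for label, keywords in cues.items():  # checked in order; first match wins
        if any(k in lowered for k in keywords):
            return label
    return default

SENTIMENT_CUES = {
    "Positive": ["love", "ordered", "beautiful"],
    "Negative": ["overpriced", "scam", "doubt"],
}
THEME_CUES = {
    "Skepticism":       ["doubt", "overpriced", "really?"],
    "Purchase intent":  ["ordered", "buying", "on the list"],
    "Product question": ["price", "sizing", "material", "restock"],
    "Tagging a friend": ["@"],
    "Aesthetic":        ["beautiful", "gorgeous", "the look"],
}

comments = pd.DataFrame({
    "variant": ["A", "A", "B"],
    "text": ["what's the sizing on this?", "overpriced imo", "ordered one"],
})
comments["sentiment"] = comments["text"].apply(classify, cues=SENTIMENT_CUES, default="Neutral")
comments["theme"] = comments["text"].apply(classify, cues=THEME_CUES, default="Off-topic")
```

Because the labels live in plain data structures rather than inside a model, swapping in a "Humour" or "Comparison-to-competitor" theme is an edit and a re-run, which is the auditability the creative director needs.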
Then Anna indexes the engagement metrics across the two variants and lays the comment data alongside. The deliverable is a single-page report the creative director can take into the next planning meeting.
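A sketch of the indexing step, with Variant A as the base of 100. The column names are mine, and the values mirror the metric table in the next section:

```python
# Index every metric to Variant A = 100 so the asymmetry reads at a glance.
# Note CPM is inverse-scored: a higher index means B was more expensive.
metrics = pd.DataFrame({
    "metric":    ["reach", "ctr_pct", "cpm_usd", "completion_75_pct"],
    "variant_a": [412_000, 1.84, 8.40, 38.0],
    "variant_b": [486_000, 1.29, 9.78, 52.0],
})
metrics["b_indexed_to_a"] = (metrics["variant_b"] / metrics["variant_a"] * 100).round(1)
# reach 118.0, ctr 70.1, cpm 116.4, completion 136.8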
The headline read
Anna leads with the metric table. Not the chart, not the commentary — the table.
| Metric | Variant A (product-led) | Variant B (narrative) | Winner |
|---|---|---|---|
| Reach | 412k | 486k | B |
| CTR | 1.84% | 1.29% | A |
| CPM | $8.40 | $9.78 | A |
| Completion (≥75%) | 38% | 52% | B |
| Save rate | 0.92% | 0.78% | A |
| Comment rate | 0.74% | 0.32% | A |
| Comment positivity | 64% | 81% | B |
| Comments with purchase intent | 6% | 11% | B |
Variant A wins on CTR, CPM, save rate, and comment rate. Variant B wins on reach, completion, comment positivity, and purchase-intent comments. Calling one of them "the winner" without qualification is the part most campaign reports get wrong. Both are winners, on different axes, for different reasons.
An indexed bar chart is the simplest way to show the asymmetry: CTR, CPM, save rate, and comment rate favour A; reach, completion, positivity, and purchase intent favour B. Anna's report places this chart right under the table because it forces the reader past the false binary.
The comment read
This is where the report stops being a metric comparison and starts being a creative diagnostic. The comment-theme distribution is genuinely different between the two variants, and the difference is the explanation.
The pattern is consistent across every brand for which Anna has analysed this kind of campaign. The product-led variant generates questions — comments asking about price, sizing, materials, availability. It also generates skepticism — people pushing back on claims, comparing to alternatives. The narrative variant generates aesthetic reactions, tagging behaviour, and quiet purchase-intent comments ("ordered one" / "this is going on the list").
This is the part the CTR-only read misses. Variant A's higher comment rate is not a sign of stronger engagement; it is a sign of friction. People are commenting because they need information they did not get from the creative. Variant B's lower comment rate is not a sign of weaker engagement; it is a sign of resolution. The story closed the loop.
Whether you want friction or resolution depends on what stage of the funnel the variant is meant to serve. A consideration-stage placement should look like Variant A — questions are good, they signal active evaluation. A brand-building placement should look like Variant B — quiet absorption is the goal.
The post-mortem now has a real answer: A is the bottom-funnel variant, B is the top-funnel variant. Run both, but stop comparing them on the same axes.
Why this matters more than the agency thinks
Creative shops live in the gap between "the work tested well" and "the work did not move the metric." The gap is usually the comments. The agency made something that resonated; the platform dashboard does not measure resonance; the client cancels the variant.
Anna's report closes the gap by treating the comments as the primary qualitative signal. The creative director walks into the next meeting with a report that says, in plain language, what the audience said. Not anecdotes — quantified themes, sentiment ratios, and excerpts. The work gets defended on the basis of what it actually did, not on the basis of what the platform decided to measure.
The follow-up question that breaks every post-mortem open: "Show me the comments from each variant ranked by reach of the commenter." High-reach commenters are the audience the brand most wants. If Variant B's high-reach commenters are tagging friends and Variant A's high-reach commenters are asking when something will be back in stock, you have learned more about your creative than three rounds of platform metrics can tell you.
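In data terms that follow-up is one sort, assuming the comment join carried a follower count for each commenter; the `commenter_followers` column and the toy rows here are hypothetical:

```python
import pandas as pd

comments = pd.DataFrame({  # toy rows; commenter_followers is a hypothetical field
    "variant":             ["A", "B", "A", "B"],
    "commenter_followers": [12_400, 88_000, 310, 5_600],
    "theme": ["Product question", "Tagging a friend", "Skepticism", "Purchase intent"],
    "text":  ["when is restock?", "@mia look at this", "doubt it lasts", "ordered one"],
})

# Top 20 highest-reach commenters per variant, themes attached.
top_voices = (comments.sort_values("commenter_followers", ascending=False)
                      .groupby("variant", sort=False).head(20))
```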
What a creative team does with this on Monday
The report changes the planning meeting in three ways.
The campaign brief stops asking for "a winning creative" and starts asking for "the variant we need for each funnel stage." This is how mature creative teams already think; the report makes it explicit and defensible.
The CPM-only argument loses its grip. The CFO can still see the CPM column, but it now sits next to the purchase-intent comment column. Both numbers are on the page. The decision conversation shifts from "which is cheaper" to "which is doing the work we hired it for."
The next round of variants gets briefed against the comment themes. If the brand wants more purchase-intent comments on the product-led variant, the brief asks for an edit that resolves more product questions on screen — sizes, materials, price points — instead of asking the audience to find out for themselves. The comment data becomes the brief.
What the deliverable looks like
A URL. The creative director opens it in the planning meeting. The CMO opens it before the budget review. The CFO opens it before the next quarterly. The same page, the same numbers, the same comment excerpts. Nobody rebuilds the deck.
It sits between the agency dashboard (which is too shallow to explain anything) and the brand-tracker survey (which is too slow to act on). It is the operational layer that creative teams have been writing into Google Docs by hand for years, finally automated and designed to be sent.
Frequently asked questions
How many comments do I need before the comment analysis is meaningful?
Around 500 comments per variant is the practical floor for the theme-level read. Below that, the smaller theme buckets (purchase intent, skepticism) get noisy. For most paid campaigns with reasonable spend, this threshold is cleared within the first week. Anna flags when a comment theme has too few examples to draw a confident conclusion.
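A sketch of the flagging logic, reusing the classified `comments` frame from the earlier sketch. The 500-per-variant floor comes from above; the 30-examples-per-theme cutoff is an assumed illustration, not Anna's published threshold:

```python
MIN_COMMENTS_PER_VARIANT = 500   # practical floor described above
MIN_EXAMPLES_PER_THEME = 30      # assumed cutoff, for illustration only

counts = comments.groupby(["variant", "theme"]).size().rename("n").reset_index()
counts["low_sample"] = counts["n"] < MIN_EXAMPLES_PER_THEME

variant_totals = comments["variant"].value_counts()
thin_variants = variant_totals[variant_totals < MIN_COMMENTS_PER_VARIANT]
```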
What if the two variants did not run with equal spend?
That is the normal case. Anna normalises the engagement metrics against impressions and spend so the comparison is rate-based, not absolute-volume-based. The comment analysis is reported as percentages of each variant's total comment volume, so unequal sample sizes do not bias the theme split. The metric table shows both the raw value and each variant indexed against the other.
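A sketch of the normalisation, with hypothetical impression and spend volumes chosen to reproduce the rates in the headline table:

```python
import pandas as pd

raw = pd.DataFrame({  # hypothetical absolute volumes; only the rates matter
    "variant":     ["A", "B"],
    "impressions": [2_150_000, 2_610_000],
    "spend_usd":   [18_060.0, 25_530.0],
    "clicks":      [39_560, 33_670],
    "comments":    [15_910, 8_350],
})
raw["ctr_pct"]          = raw["clicks"]    / raw["impressions"] * 100   # 1.84 / 1.29
raw["comment_rate_pct"] = raw["comments"]  / raw["impressions"] * 100   # 0.74 / 0.32
raw["cpm_usd"]          = raw["spend_usd"] / raw["impressions"] * 1000  # 8.40 / 9.78

# Theme split per variant (given the classified `comments` frame):
#   comments.groupby("variant")["theme"].value_counts(normalize=True) * 100
```

Once everything is a rate or a within-variant percentage, the spend imbalance drops out of the comparison entirely.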
Can Anna do this for organic posts, not just paid?
Yes. The methodology is identical — pull engagement metrics, pull comments, classify, compare. The only difference is the data source: organic posts come from the platform's native analytics export rather than the ad platform API. For brands running the same creative in both paid and organic placements, Anna compares the two channels alongside the variant comparison, which is a useful additional cut.
What about negative comments — are they automatically a bad sign?
Not automatically. Anna distinguishes between negative sentiment that is skepticism (which often signals an active comparison and can drive conversion) and negative sentiment that is brand criticism (which is a different problem). The theme classification separates them. A campaign with 18% skepticism and 4% brand criticism is in a different position than one with 8% skepticism and 14% brand criticism. The headline negativity score is the same; the prescription is opposite.