Two stores do five million dollars in revenue this year. Store A grew from three million to five million. Store B grew from four million to five million. The first looks like the better business until you build the cohort table and discover Store A is acquiring forty percent more first-time buyers each month and almost none of them come back. Store B is smaller in absolute growth but its month-six retention has climbed eight points year over year. Store A is a treadmill. Store B is a flywheel. The aggregate revenue line cannot tell you which is which. The cohort triangle can.
Most DTC operators have heard the word "cohort" in a deck and nodded. Very few have actually sat down with their orders export and built one. The piece you are about to read is the version I wish someone had handed me the first time I tried. It assumes you have a Shopify store, a working knowledge of spreadsheets or SQL, and twelve months of order history. By the end you will have a cohort revenue table, a cohort retention curve, and an opinion about whether your business is healthy. If you would rather skip the spreadsheet maintenance entirely, Sentinel builds and refreshes this view automatically - but you should still understand the math before you outsource it.
Why aggregate revenue is the lie operators keep believing.
If you only look at monthly revenue, you cannot distinguish a retention business from an acquisition business. A store running aggressive paid media can post record months for a year and a half while quietly burning every customer it touches. The dashboards will look fine until ad costs tick up two dollars and the entire model collapses inside a quarter. Cohorts surface that fragility before the market does.
A cohort is just a group of customers who share an acquisition month. Once you bucket every order by the customer's first-purchase month - not the order's month - the question shifts from "how much did we sell in March" to "how much is the March cohort worth at month one, month three, month six, and month twelve." Those are different questions, and only the second one tells you whether the business compounds.
This is the same lens that a good analytics consulting engagement will push you toward in the first week. It is also the lens behind every honest LTV-to-CAC ratio. If you cannot say with confidence what your six-month cohort revenue per acquired customer is, your CAC target is a guess wearing a suit.
What Shopify gives you natively, and where it stops.
Shopify ships a Customer Cohort Analysis report inside the Analytics tab on Advanced and Plus plans. It is genuinely useful as a first look. It buckets customers by first-order month and shows you cumulative revenue and retained customer counts across subsequent months. For a store doing under one million in annual revenue, it is enough to start.
It stops being enough quickly. Three specific limitations show up the moment you try to make a real decision with it.
First, you cannot segment. Shopify's report treats every customer in a cohort identically. You cannot ask "what does the March cohort look like for customers acquired through Meta versus through organic search," which is the only question that actually matters when you are deciding where to push spend. Second, the export is awkward. The report renders in the browser but does not give you a clean CSV of the underlying triangle, which means you cannot pivot, model, or layer in CAC. Third, the report is locked to revenue and customer count. It will not show you average order value by cohort age, second-order rate, or contribution margin per cohort. Those are the metrics that change pricing decisions.
So we build it ourselves. The good news is the math is shockingly simple - it is one self-join and a date arithmetic step. Every operator should be able to do this once.
Building your first cohort table from a Shopify export.
Open the Shopify admin, go to Orders, and export the orders CSV with at least the last twelve months of history. Twelve months is the minimum because anything shorter cannot tell you whether your retention is improving or just regressing toward a seasonal mean. Eighteen or twenty-four months is better. The export will hand you a row per line item, but for cohort math we want a row per order. Collapse line items by order_id and keep four columns: order_id, customer_id (or email if customer ID is missing), created_at, and total_price.
Step one: assign every customer to a cohort.
For each customer_id, find the minimum created_at across all of their orders. That is their acquisition timestamp. Truncate it to the first day of its month. That month is their cohort. In SQL this is a single window function; in Sheets it is a MINIFS followed by an EOMONTH. Every customer now carries a cohort_month tag that never changes. Even if they buy again two years later, they stay in their original cohort. That immutability is the entire point.
Step two: compute months-since-acquisition for every order.
For every order, you need an integer months_since equal to the number of whole months between the customer's cohort_month and the order's created_at month. The customer's first order sits at zero. An order placed forty-five days later sits at one. This is a DATEDIF in Sheets or a DATE_DIFF in BigQuery. Conceptually you are joining the orders table to itself: one row per order, joined to the customer's first-order date, with a date subtraction.
Step three: pivot revenue by cohort and months-since.
Now sum total_price grouped by cohort_month and months_since. The result is a triangle. Rows are cohorts. Columns are months of life. The cell at row "March 2025" and column "3" is the total revenue the March cohort generated in their fourth month of life. The triangle shape comes from the fact that the March 2025 cohort cannot have a column 12 yet if today is May 2026.
Step four: layer in retention.
Repeat step three but instead of summing revenue, count distinct customer_id per cohort and months_since. Then divide each cell by the cohort size at month zero. That gives you a percentage retention curve per cohort: of the customers acquired in March, what share placed at least one order in month three. This is your second triangle, and it is the one investors and operators will both ask about first.
That is the entire build. Two pivots, one date subtraction, no machine learning, no SaaS. If your data is clean it takes thirty minutes. If your data is messy - duplicate customer records, refunds counted as orders, B2B accounts mixed with DTC - clean it before you pivot, because cohort tables amplify garbage. A single B2B account placing a fifteen-thousand-dollar order in month nine of an otherwise normal cohort will make that cohort look elite when it is just one buyer.
Reading the triangle without lying to yourself.
A healthy DTC cohort table has three properties. Recent cohorts look at least as good as older ones at the same age. Month-three revenue per customer is a meaningful fraction of month-one revenue. And the curve flattens rather than crashing. Below is what a usable cohort revenue-per-customer table looks like for a hypothetical apparel store. Numbers are revenue per acquired customer, not totals.
| Cohort | M0 | M1 | M2 | M3 | M6 | M12 |
|---|---|---|---|---|---|---|
| Mar 2025 | $78 | $14 | $22 | $26 | $31 | $24 |
| Jun 2025 | $82 | $16 | $24 | $28 | $33 | - |
| Sep 2025 | $71 | $11 | $15 | $18 | - | - |
| Dec 2025 | $94 | $9 | $11 | - | - | - |
The March and June cohorts look healthy. Month-three revenue is roughly thirty-three percent of month-zero revenue, and the curve continues to add dollars through month six. The September cohort is a warning. Month-three revenue dropped to twenty-five percent of month zero, and the absolute dollars are lower across every column. The December cohort is the loudest signal in the table: month-zero revenue jumped to ninety-four dollars per customer because of holiday discounting, but month-one and month-two are the worst in the entire history. That is the classic "we bought revenue with a discount and the customers never came back" pattern.
The rule of thumb that has held up across the brands I have worked with: if month-three revenue is less than thirty percent of month-one revenue for a cohort, you have a retention problem, not a customer-acquisition-cost problem. Lowering CAC will not save you. The customers you are acquiring are not the customers you want.
"Aggregate revenue tells you what happened. Cohorts tell you what is going to happen."
The four operator questions cohort data actually answers.
Once the triangle exists, four questions become answerable in minutes instead of arguments. These are the questions to bring to the next leadership meeting.
One: is our retention improving or degrading over time?
Read down a single column - say, month three - across every cohort. If month-three revenue per customer climbs from cohort to cohort, your product, onboarding, or post-purchase flow is working. If it falls, something has changed in the customers you are acquiring or the experience you are delivering. This is the single most important time-series in your business and almost no Shopify dashboard surfaces it.
Two: what is our true twelve-month LTV by acquisition channel?
Split the cohort table by acquisition source. You will need to join your orders to your first-touch attribution data, which is its own problem - see the attribution gap piece for that. With channel-segmented cohorts, twelve-month LTV stops being a single store-wide number and becomes "Meta cohorts return $164 by month twelve, organic returns $241, affiliate returns $89." Now you can set channel-specific CAC ceilings instead of a single blended ceiling that quietly subsidizes the worst channel.
Three: are discounts buying us customers or just margin compression?
Tag each cohort with its average first-order discount rate. Compare month-three retention between high-discount and low-discount cohorts. The honest answer in almost every brand I have looked at is that customers acquired with greater than thirty-percent first-order discounts retain at roughly half the rate of full-price acquisitions. That changes how you think about every Black Friday plan.
Four: how long until a cohort pays back its acquisition cost?
Take cumulative cohort revenue per customer, multiply by your contribution margin, and find the month where it crosses your blended CAC. That month is your payback period. Operators who fund growth from operating cash flow need this under six months. Operators with patient capital can run it to twelve. Operators who do not know the number are gambling.
Where Sentinel comes in.
Everything above can be done in Google Sheets the first time. The second time, with a fresh export, you will spend an hour. By the fourth time, you will quietly stop doing it, which is when the business starts drifting and nobody notices for a quarter.
We built Sentinel because cohort analysis is the kind of work that has to be continuous to be useful. Sentinel connects to your Shopify store, refreshes the cohort triangle nightly, segments it by acquisition channel using your existing attribution layer, and flags cohorts that deviate from your trailing twelve-month baseline. When the December cohort posts a month-one number twenty percent below trend, you hear about it the next morning instead of in a quarterly review.
The point is not that Sentinel does something a competent analyst could not do. It is that competent analysts have a thousand other things on their plate, and cohort discipline is the first thing that slips. Automating the boring half of the work - the export, the pivot, the refresh - is what keeps the strategic half of the work happening every week instead of every quarter.
What to do this week.
If you have not built a cohort table for your store, do it before Friday. Export twelve months of orders, build the two triangles, and answer the four questions. You do not need permission, you do not need a vendor, and you do not need a data team. You need a spreadsheet and an honest two hours.
Then bring the table to your next leadership meeting and ask the question that aggregate revenue cannot answer: are we building a business or running a treadmill. The triangle will tell you.
Frequently asked questions.
How much order history do I need before cohort analysis is meaningful?
Twelve months minimum, eighteen is better. Anything shorter and you cannot separate retention signal from seasonality. If you launched eight months ago, build the table anyway, but treat the conclusions as preliminary.
Should I cohort by week or by month?
By month for almost every DTC brand. Weekly cohorts produce too much noise relative to the size of a typical buying group. The exception is a high-velocity consumables brand doing six-figure weekly revenue, where weekly cohorts surface real signal.
Do I include refunded orders?
Subtract refunds at the cell level. A refunded order in month three should reduce that cohort's month-three revenue. Otherwise you will systematically overstate retention for cohorts with high return rates, which tend to be your discount-acquired cohorts - exactly the ones you are most likely to misjudge.
How does cohort analysis relate to LTV?
Cohort analysis is how you compute LTV honestly. The cumulative revenue per customer across columns of a cohort row, multiplied by contribution margin, is that cohort's actual LTV at each age. Anything else is an extrapolation. For more depth, see our piece on cohort-based LTV modeling.
Can I do this in Shopify alone or do I need a warehouse?
Sheets handles it through about one hundred thousand customers. Past that, get the data into BigQuery or Snowflake, because Sheets will start choking on the joins and the pivots will take minutes to refresh. If you are crossing that threshold, our AI development team has built the warehouse pipelines for several Shopify brands and the playbook is well-trodden.