“When a store is doing poorly, there are literally 1,000 reasons why that can be. And it is usually death by 1,000 cuts.”
He went on to list them:
- Inventory mix.
- Staffing turnover.
- A new competitor across the street.
- Shrink.
- Pricing changes someone forgot to communicate.
- Store manager reviews.
- Traffic.
- Crime in the area.
- Did the inventory not arrive on time, or did it arrive at the wrong price points?
I have heard the same thing from a pawn chain COO, a hotel management group, and a real estate firm.
It is the most expensive, least solved problem in the entire category of multi-location operations analytics.
Nobody has ever managed to crack it at scale.
This is what I learned.
The 10-hypothesis problem nobody outside retail talks about
Here is the diagnostic load this strategy leader carries every week. When one store misses plan, his team has to run down a list that usually looks like this:
- Did the inventory arrive on time, in the right mix, at the right price points?
- Is the store manager new, on a performance plan, or carrying bad reviews?
- Is staffing complete, or are they running thin?
- Has shrink jumped?
- Did a competitor open inside the trade area?
- Is the traffic problem local, regional, or chain-wide?
- Has pricing on the top SKUs drifted out of line with the market?
- Is the assortment matched to the seasonal mix this area actually buys?
- Did we plan it this way, or did we do this to ourselves?
He said:
“There are usually 10 primary hypotheses, and a lot of work to figure out what is happening below the surface.”
Multiply that by the 500 to 1,000 locations a chain that size operates.
The math gets ugly fast.
A district manager cannot run 10 hypotheses by hand across 40 stores every week.
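The back-of-envelope version, using his numbers (the 40-store district comes from his math; the chain-wide total assumes the upper end of his range):

```python
# Back-of-envelope diagnostic load, using the numbers above.
hypotheses = 10
stores_chain_wide = 1_000      # upper end of the 500-to-1,000 range
stores_per_district = 40

print(f"{hypotheses * stores_chain_wide:,} hypothesis checks per week, chain-wide")    # 10,000
print(f"{hypotheses * stores_per_district} checks per week for one district manager")  # 400
```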
So most of the diagnostic work never happens.
The work that does happen lands too late.
This is what good retail analytics for operations leaders should solve, and what almost no BI tool actually does.
Dashboards do not tell you the why behind what is showing up. Monitoring tells you what; investigation tells you why.
The two are not the same job.
“By the time someone gets back, the moment has passed”
The quote that I keep replaying is this one:
Someone will ask the question. By the time someone is able to get back to them, the moment has passed. The moment has passed to be able to action it. Because actions take a long time to flow through a big, $12 billion company.
Read it twice.
It is not a complaint about analyst speed.
It is an admission that the size of the org itself eats the window for action.
Imagine:
- A district manager flags a problem on Monday.
- A request goes to the analytics team. They route it. They pull the data. They cross-reference three systems.
- Then they write back.
- By the time the answer arrives, two weeks have passed.
The inventory shipment that would have fixed the assortment problem has already been ordered. The competitor has already taken the customers. The store manager has already lost the staff.
This is what I now call:
Diagnostic Latency
Diagnostic latency is the gap between when a question gets asked and when an answer arrives in a form someone can act on.
In big chains, diagnostic latency is the silent killer.
It is not visible on any P&L line.
But it shows up everywhere else, including business performance reports.
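It is also measurable, if unglamorously. A minimal sketch, assuming you can export analytics requests with asked and answered timestamps; the data and field names here are invented:

```python
from datetime import datetime
from statistics import median

# Hypothetical export of analytics requests: when the question was asked,
# and when an actionable answer came back. All values are illustrative.
requests = [
    {"asked": datetime(2024, 3, 4), "answered": datetime(2024, 3, 18)},
    {"asked": datetime(2024, 3, 5), "answered": datetime(2024, 3, 12)},
    {"asked": datetime(2024, 3, 6), "answered": datetime(2024, 3, 27)},
]

# Diagnostic latency: the gap between question asked and answer delivered.
latencies_days = [(r["answered"] - r["asked"]).days for r in requests]
print(f"Median diagnostic latency: {median(latencies_days)} days")
```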
A few honest observations about diagnostic latency from this conversation:
The big-org tax is real
The bigger the company, the longer the route from question to answer.
It compounds
Late diagnostics produce reactive plans, which produce more diagnostics, which arrive even later.
Hiring more analysts does not close it
You cannot brute-force this with headcount.
The way to close the gap is to move the diagnosis upstream.
Run the 10 hypotheses before the question gets asked.
Have the answer ready when the district manager opens the report Monday morning.
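A minimal sketch of what that upstream run might look like. Every function, threshold, and field name below is a hypothetical stand-in for whatever your stack actually exposes; the point is the shape, not the specifics:

```python
# Illustrative upstream hypothesis runner. Each check inspects one candidate
# cause and returns a finding, or None if nothing looks off. All functions,
# thresholds, and field names are invented stand-ins.

def check_shrink(store):
    if store["shrink_pct"] > 1.5 * store["shrink_baseline_pct"]:
        return f"Shrink at {store['shrink_pct']:.1f}%, well above baseline."
    return None

def check_staffing(store):
    if store["headcount"] < store["planned_headcount"]:
        gap = store["planned_headcount"] - store["headcount"]
        return f"Running {gap} associates below plan."
    return None

HYPOTHESES = [check_shrink, check_staffing]  # ...plus the other eight

def weekly_brief(store):
    """Run every hypothesis and pre-write the Monday findings."""
    findings = [f for check in HYPOTHESES if (f := check(store)) is not None]
    return findings or ["No store-level anomalies; miss may be regional or chain-wide."]

store = {"shrink_pct": 3.2, "shrink_baseline_pct": 1.4,
         "headcount": 6, "planned_headcount": 8}
print("\n".join(weekly_brief(store)))
```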
Why every generic AI POC has died on the same hill
He told me about a POC his team ran a few months ago.
A consulting firm offered to point an AI at his business and surface insights.
He gave them public data only. The output came back.
His verdict, almost word for word:
My team can do that. On their own. Like, nothing here. While it is faster, it is not incremental, and there is a huge cost difference.
That is the bar for retail AI.
Incremental performance beyond what the existing analytics team can already do.
Generic AI on public data fails that bar every time. Here is why:
- Public data has no store-level signal. It does not know your shrink, your assortment, your traffic mix.
- The model does not know how your business actually runs. Every retailer defines “comp store” differently. Every retailer has its own thresholds for what counts as “off.”
- Without that context, the AI surfaces the obvious. The analytics team can already see the obvious.
This is the failure mode the whole category keeps hitting.
It is also why AI investigation beyond the dashboard is a meaningfully different category from “chat with your data.”
Asking ambiguous questions of a generic model is not the same as investigating a specific store the way your best operator would.
That is where agentic analytics starts to earn its name.
The tribal knowledge problem (which nobody has written down)
The same leader said something else that hit a nerve:
Every business, the data means something else. You only use comp stores for this. You only use this for that. Hey, we’ve looked at this seven times, and this is explainable, or it’s not explainable.
That is tribal knowledge.
It lives in the head of the regional VP who has been in the chain for 18 years, and in the COO who can walk into a store and feel what is wrong before he looks at the report.
It is not in any system. It is not in any data dictionary.
It is not written down because writing it down has historically been more expensive than just calling the person who knows.
Tribal knowledge in retail usually looks like:
- Definitions: “Comp store”, “opportunity store”, “underperformer” mean different things in different chains.
- Thresholds: What counts as a real signal versus normal noise.
- Pattern memory: “We’ve looked at this seven times, and the answer is always the same thing.”
- Hand-off logic: When to escalate, when to wait, when to send someone to the store.
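Once encoding gets cheap, writing it down starts to look less like documentation and more like configuration. A sketch of what that might look like, with every definition, threshold, and rule invented for illustration:

```python
# Hypothetical encoding of tribal knowledge as versioned, reviewable data.
# Every definition, threshold, and rule here is an invented example.
TRIBAL_CONTEXT = {
    "version": "2024-03-r7",
    "definitions": {
        # This chain's own meaning of "comp store" -- yours will differ.
        "comp_store": "open 13+ months, not remodeled in the last 90 days",
    },
    "thresholds": {
        # What counts as a real signal versus normal week-to-week noise.
        "traffic_drop_pct": 8.0,
        "shrink_alert_pct": 2.0,
    },
    "pattern_memory": [
        "Holiday-week dips at mall locations are explainable; do not escalate.",
    ],
    "handoff": {
        "escalate_after_weeks": 2,
        "send_someone_when": "two or more hypotheses confirm in the same week",
    },
}
```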
That is the part of the analytics problem tacit knowledge research has been pointing at for forty years.
AI on its own cannot fix it.
The earlier conversation I wrote up with an ecommerce operator about BI pointed at the same gap from a different angle.
The bulk of the context you need to interpret a metric is not in the metric itself. It is in the heads of the people who run the business.
What changes when investigation runs on top of your stack, not under it
His other concern was practical.
A previous vendor had wanted to take his licensed data outside the company.
- Circana
- Traffic data
- Internal sales feeds
Renegotiating those licenses with every third party was a non-starter.
So the answer we have landed on is the opposite of the typical SaaS model.
The agents run inside the customer’s own AWS environment. We never take the data outside the perimeter. We never own it. Licensed data fits under the existing agreements, because no third party is touching it. Internal IT signs an NDA, not 12 new data-sharing contracts.
What this unlocks looks like this:
- Existing Power BI, Tableau, or data warehouse stays exactly where it is.
- The investigation layer runs on top, not instead of. (If you want the long version of why this matters, “Tableau tells the story; Scoop explains the plot twist” is the cleanest framing I have written on it.)
- 10 hypotheses run in parallel, every store, every week (sketched after this list).
- Each store gets the equivalent of a Gartner analyst in a box, reading its own data, writing up its own findings.
- The district manager opens an inbox on Monday and reads the result. No login. No query building.
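A minimal sketch of that weekly fan-out, with hypothetical stand-ins for the investigation and delivery steps:

```python
# Illustrative weekly fan-out: run the full hypothesis battery for every
# store in parallel, then deliver findings to each district manager's inbox.
# investigate() and deliver() are stand-ins for the real steps.
from concurrent.futures import ThreadPoolExecutor

def investigate(store_id: int) -> str:
    # Placeholder for running all 10 hypothesis checks against this store.
    return f"Store {store_id}: shrink elevated; staffing 2 below plan."

def deliver(district_manager: str, brief: str) -> None:
    # Stand-in for email delivery: the DM reads this Monday morning.
    print(f"To {district_manager}: {brief}")

stores = {101: "dm_east", 102: "dm_east", 103: "dm_west"}

# Fan out across every store each week; nobody has to ask a question first.
with ThreadPoolExecutor(max_workers=32) as pool:
    for store_id, brief in zip(stores, pool.map(investigate, stores)):
        deliver(stores[store_id], brief)
```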
That is the difference between dashboards delivering charts and agentic BI delivering an actual data analyst.
The chart shows the dip.
The analyst tells you it is the new competitor on Memorial and Lincoln, that the SKU mix is wrong for the area, and that the store manager has been short two associates for three weeks.
The mechanism behind this kind of investigation is structured.
It is not a model running free.

The bigger pattern
I sat back after this call and made a list of the businesses where I keep hearing the exact same conversation.
The list looks like this:
- Retail chains with hundreds of stores.
- Hotel groups, especially the roughly 90 percent of hotels that are not owned by their brand and have to manage themselves.
- Property management firms with portfolios spread across regions.
- Real estate firms layering CRM with public market data.
- Multi-unit franchise concepts.
- Even non-location businesses like sourcing and QA, where the pattern (repeated decisions, distributed expertise, lots of data, no time) still holds.
The shared characteristics:
- Distributed decision-making
- Data-rich
- BI already in place
- A tribal-knowledge layer that does not scale
Every leader I talk to in those segments is being asked to do more with less.
- Team size is flat.
- Compression is permanent.
- The window of time to figure something out before it costs real money keeps getting shorter.
For a long time, the conventional wisdom was that AI would not work on this kind of problem because models do not understand specific businesses.
That was true.
It is no longer true when you give the model the right harness and the right encoded context.
This is the shift that makes augmented analytics qualitatively different from the BI generation that came before.
The technology was never the bottleneck.
Most BI projects fail not because the technology did not work but because nobody built the interpretation layer on top.
That interpretation layer is what is finally being built.
Frequently asked questions
What is store underperformance diagnosis, and why is it hard?
Store underperformance diagnosis is the work of figuring out why a specific store is missing plan. It is hard because there are usually 10 or more plausible causes (inventory, staffing, traffic, competitor activity, pricing, assortment, shrink), and each one requires a different data source to evaluate. Manual diagnosis at scale across hundreds of locations is impossible inside a weekly cycle. Anomaly detection alone is not enough. You need investigation on top of it.
Why have retail AI POCs failed so often?
The most common failure mode: the POC uses public data only, produces obvious-level insights, and the analytics team reports back that “my team can do this already.” Without store-level data and encoded business context, generic AI cannot clear the incremental-value bar. Agentic analytics versus augmented analytics is partly a question of who carries that context.
Do I have to replace Power BI or Tableau to do this?
No. The investigation layer sits on top. Your existing BI stays. Dashboards continue to show what happened. The AI layer runs the diagnostic work on top of the same data. Agentic analytics works differently from traditional BI dashboards, but it does not replace them.
How does AI actually capture tribal knowledge?
By literally recording it. The Scoop team sits with the senior operator, walks through how they read their reports, and records every observation. Those recordings become the operating context the AI uses to investigate. The output is a versioned, reviewable harness, not a black box. The investigation workflow explains the mechanism in detail.
Can this work if my data lives across systems that do not talk to each other?
Yes. AI is unusually good at integrating across messy, partial data sources. The integration work has gotten an order of magnitude cheaper than it was in the previous BI generation. The bottleneck is no longer integration; it is interpretation.
What is the actual business benefit of doing this for a retail chain?
The benefit is closing diagnostic latency: the gap between when a problem surfaces and when an actionable answer reaches the operator. Data analytics benefits retailers at multiple layers, but the single biggest unlock for multi-location chains is moving from “we will look into it” to “here is what happened and here are the three things to try this week.”