How Location Testing Works
Location testing is fundamentally different from online A/B testing. Understanding why helps you design better experiments and interpret results correctly.
Why stores aren't websites
On a website, you can show Visitor A one price and Visitor B another price at the same moment. That's individual-level randomization—the gold standard of experimentation.
In a physical location, every customer gets the same experience. You can't show different prices to different people walking through the same store. The location is your unit of experimentation, not the individual.
This means:
- Your sample size is measured in locations, not visitors
- You need different statistical methods than traditional A/B testing
- Test design (which locations, how many, how long) matters much more than it does online
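To make "the location is the unit" concrete, here is a minimal sketch of assigning whole locations to test and control. It uses plain random assignment, which is an illustrative assumption and not necessarily how ProofPod picks its groups; the store IDs and the `assign_locations` helper are hypothetical.

```python
import random

def assign_locations(location_ids, n_test, seed=0):
    """Randomly split whole locations into a test group and a control group."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    shuffled = list(location_ids)
    rng.shuffle(shuffled)
    test = sorted(shuffled[:n_test])
    control = sorted(shuffled[n_test:])
    return test, control

# Hypothetical store IDs, for illustration only.
stores = [f"store_{i:02d}" for i in range(1, 21)]
test_group, control_group = assign_locations(stores, n_test=10, seed=42)

# The sample size here is 20 locations, no matter how many customers visit them.
print(len(test_group), "test locations,", len(control_group), "control locations")
```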
The before/after trap
The simplest approach—compare this month's revenue to last month's—is tempting but unreliable. Dozens of factors change between months: weather, holidays, competitor actions, economic conditions. Any observed change could be caused by your experiment or by something else entirely.
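A small sketch shows how the trap plays out. The numbers are made up: assume every store sells 10% more this month for seasonal reasons, and assume the change you tested has no effect at all.

```python
def pct_change(before, after):
    """Percent change from before to after."""
    return (after - before) / before * 100

SEASONAL_LIFT = 0.10  # assumption: all stores are up 10% this month
TRUE_EFFECT = 0.0     # assumption: the tested change does nothing

last_month = 100_000
this_month = last_month * (1 + SEASONAL_LIFT) * (1 + TRUE_EFFECT)

# A before/after comparison credits the whole seasonal lift to the experiment.
naive_estimate = pct_change(last_month, this_month)
print(f"Naive before/after estimate: {naive_estimate:+.1f}%")
print(f"True effect:                 {TRUE_EFFECT * 100:+.1f}%")
```

The naive estimate reports roughly +10% even though the experiment changed nothing.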
How ProofPod solves this
ProofPod uses Difference-in-Differences (DiD), a method developed by economists to isolate causal effects in exactly these situations.
The key insight: compare the change in your test locations to the change in your control locations over the same period.
| | Before | After | Change |
|---|---|---|---|
| Test locations | $100K | $112K | +12% |
| Control locations | $100K | $110K | +10% |
| Treatment effect | | | +2% |
Both groups went up (maybe it's summer), but test locations went up more. That difference of 2 percentage points is the estimated effect of your experiment.
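The arithmetic behind the table can be sketched in a few lines. The function names are illustrative and not part of ProofPod; the inputs are the figures from the table.

```python
def pct_change(before, after):
    """Percent change from before to after."""
    return (after - before) / before * 100

def did_estimate(test_before, test_after, control_before, control_after):
    """Difference-in-Differences: change in test minus change in control."""
    test_change = pct_change(test_before, test_after)
    control_change = pct_change(control_before, control_after)
    return test_change - control_change

effect = did_estimate(
    test_before=100_000, test_after=112_000,
    control_before=100_000, control_after=110_000,
)
print(f"Treatment effect: {effect:+.1f} percentage points")
```

Subtracting the control group's change removes whatever affected both groups, such as the season, and leaves the part that only the test locations experienced.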
Fixed effects
ProofPod's model also controls for:
- Day-of-week effects — Saturdays are different from Tuesdays
- Seasonal patterns — week-of-year adjustments
- Location-level differences — some stores are just bigger
These "fixed effects" strip out predictable variation so the treatment effect estimate is cleaner.
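One way to picture a fixed effect is as subtracting each group's own average. The sketch below removes a location-level effect from made-up daily revenue; the data and the `demean_by` helper are hypothetical, and a real model estimates all the fixed effects together in the regression instead of one at a time.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical daily revenue rows: (store, day_of_week, revenue).
rows = [
    ("big_store", "Tue", 9_000), ("big_store", "Sat", 15_000),
    ("small_store", "Tue", 900), ("small_store", "Sat", 1_500),
]

def demean_by(rows, key_index, value_index=2):
    """Subtract each group's mean from its rows (the 'within' transformation)."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key_index]].append(row[value_index])
    group_means = {key: mean(values) for key, values in groups.items()}
    return [
        row[:value_index] + (row[value_index] - group_means[row[key_index]],)
        for row in rows
    ]

# Remove the store-level average so store size no longer dominates.
adjusted = demean_by(rows, key_index=0)
for store, day, value in adjusted:
    print(store, day, value)
```

After the adjustment both stores are centered on zero, so what remains is how each day compares with that store's own typical day.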
Under the hood
The regression model for the outcome Y (for example, daily revenue at a location) is:
Y = β₀ + β₁·Treatment + β₂·Post + β₃·(Treatment×Post) + DoW_FE + WeekOfYear_FE + Store_FE + ε
β₃ is the treatment effect—the coefficient on the interaction of being a test location and being in the post period. Standard errors are clustered at the location level to account for within-location correlation over time.
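To see why β₃ is the treatment effect, here is a minimal sketch that fits only the first four terms of the model by ordinary least squares in plain Python. It leaves out the fixed effects and the clustered standard errors to stay short, and the observations are invented so that the group averages match the earlier table (in $K). The `solve` and `ols` helpers are illustrative, not ProofPod's implementation.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x

def ols(X, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(X[0])
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    Xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    return solve(XtX, Xty)

# Hypothetical observations: (treatment, post, revenue in $K).
observations = [
    (1, 0, 98), (1, 0, 102), (1, 1, 111), (1, 1, 113),
    (0, 0, 99), (0, 0, 101), (0, 1, 109), (0, 1, 111),
]

# Design matrix columns: intercept, Treatment, Post, Treatment x Post.
X = [[1, t, p, t * p] for t, p, _ in observations]
y = [revenue for _, _, revenue in observations]

b0, b1, b2, b3 = ols(X, y)
print(f"Interaction coefficient (b3): {b3:.2f}")
```

With these numbers the interaction coefficient comes out at about 2, meaning $2K: the same answer you get by subtracting the control group's change from the test group's change by hand.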
What ProofPod automates
Without ProofPod, running this analysis requires a data scientist who can:
- Select balanced test and control groups
- Build and validate a synthetic control
- Run the regression with proper fixed effects and clustered standard errors
- Interpret the Bayesian posterior for decision-making
- Monitor guardrail metrics for side effects
ProofPod does all of this automatically. You define the test, and ProofPod handles the statistics.