
How Location Testing Works

Location testing is fundamentally different from online A/B testing. Understanding why helps you design better experiments and interpret results correctly.

Why stores aren't websites

On a website, you can show Visitor A one price and Visitor B another price at the same moment. That's individual-level randomization—the gold standard of experimentation.

In a physical location, every customer gets the same experience. You can't show different prices to different people walking through the same store. The location is your unit of experimentation, not the individual.

This means:

  • Your sample size is measured in locations, not visitors
  • You need different statistical methods than traditional A/B testing
  • Test design (which locations, how many, how long) matters much more than it does online

The before/after trap

The simplest approach—compare this month's revenue to last month's—is tempting but unreliable. Dozens of factors change between months: weather, holidays, competitor actions, economic conditions. Any observed change could be caused by your experiment or by something else entirely.

How ProofPod solves this

ProofPod uses Difference-in-Differences (DiD), a method developed by economists to isolate causal effects in exactly these situations.

The key insight: compare the change in your test locations to the change in your control locations over the same period.

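| Group             | Before | After | Change |
|-------------------|--------|-------|--------|
| Test locations    | $100K  | $112K | +12%   |
| Control locations | $100K  | $110K | +10%   |
| Treatment effect  |        |       | +2%    |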

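Both groups went up (maybe it's summer), but test locations went up more. That difference of 2 percentage points is your experiment's estimated effect. The estimate rests on one assumption: without the change, test and control locations would have moved in parallel.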
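The arithmetic in the table can be sketched in a few lines of Python. This is only an illustration using the table's numbers; the function names `pct_change` and `did_effect` are made up for this example and are not part of ProofPod.

```python
def pct_change(before, after):
    """Percentage change from the before period to the after period."""
    return (after - before) * 100 / before


def did_effect(test_before, test_after, control_before, control_after):
    """Difference-in-differences: test change minus control change,
    expressed in percentage points."""
    return pct_change(test_before, test_after) - pct_change(control_before, control_after)


# A before/after comparison on test locations alone credits the whole +12% to the experiment.
naive = pct_change(100_000, 112_000)

# Subtracting the control group's change removes what happened everywhere (+10%).
effect = did_effect(100_000, 112_000, 100_000, 110_000)

print(f"naive before/after: {naive:+.0f}%")                 # +12%
print(f"difference-in-differences: {effect:+.0f} points")   # +2 points
```

The gap between the two printed numbers is the shared trend that a before/after comparison would wrongly count as a result.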

Fixed effects

ProofPod's model also controls for:

  • Day-of-week effects — Saturdays are different from Tuesdays
  • Seasonal patterns — week-of-year adjustments
  • Location-level differences — some stores are just bigger

These "fixed effects" strip out predictable variation so the treatment effect estimate is cleaner.
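To see what "stripping out predictable variation" means, here is a minimal sketch that removes store and day-of-week averages from a tiny, made-up panel. It assumes a balanced panel (every store observed on every day type) and handles only two of the three effects listed above. A real model estimates all the effects jointly inside the regression, so treat this as intuition rather than ProofPod's implementation. The function name `two_way_demean` and the sales figures are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def two_way_demean(rows):
    """Remove store and day-of-week averages from a balanced panel.

    rows: list of (store, day_of_week, sales) tuples, one per store-day cell.
    Returns (store, day_of_week, residual) tuples.
    """
    by_store = defaultdict(list)
    by_dow = defaultdict(list)
    for store, dow, sales in rows:
        by_store[store].append(sales)
        by_dow[dow].append(sales)

    grand = mean(sales for _, _, sales in rows)
    store_mean = {k: mean(v) for k, v in by_store.items()}
    dow_mean = {k: mean(v) for k, v in by_dow.items()}

    # Subtract both averages, then add the grand mean back so it is not removed twice.
    return [
        (store, dow, sales - store_mean[store] - dow_mean[dow] + grand)
        for store, dow, sales in rows
    ]


# Hypothetical data: store B is simply bigger, and Saturdays add 500 everywhere.
rows = [
    ("A", "Tue", 1000.0), ("A", "Sat", 1500.0),
    ("B", "Tue", 3000.0), ("B", "Sat", 3500.0),
]
residuals = two_way_demean(rows)
for store, dow, resid in residuals:
    print(store, dow, round(resid, 2))
```

In this example all of the variation comes from store size and day of week, so every residual is zero. In real data, whatever remains after this step is the variation the treatment effect has to be found in.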

Under the hood

The regression model is:

Y = β₀ + β₁·Treatment + β₂·Post + β₃·(Treatment×Post) + DoW_FE + WeekOfYear_FE + Store_FE + ε

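Here Y is the outcome being measured (revenue, for example), Treatment is 1 for test locations and 0 for control locations, and Post is 1 during the test period and 0 before it. β₃ is the treatment effect: the coefficient on the interaction of being a test location and being in the post period. Standard errors are clustered at the location level to account for within-location correlation over time.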
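The sketch below fits a stripped-down version of this model with ordinary least squares in plain Python, to show that β₃ comes out equal to the difference in differences. It deliberately leaves out the fixed effects and the clustered standard errors, and the eight observations (in $K) are made up so that the group averages match the earlier table. The helpers `solve` and `ols` are written for this example only and say nothing about how ProofPod computes its estimates.

```python
def solve(a, b):
    """Solve the linear system a·x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def ols(X, y):
    """Least-squares coefficients from the normal equations (X'X)·β = X'y."""
    k = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    return solve(xtx, xty)


# Hypothetical observations: (treatment, post, revenue in $K)
obs = [
    (1, 0, 98.0), (1, 0, 102.0), (1, 1, 110.0), (1, 1, 114.0),   # test locations
    (0, 0, 95.0), (0, 0, 105.0), (0, 1, 104.0), (0, 1, 116.0),   # control locations
]
X = [[1.0, t, p, t * p] for t, p, _ in obs]  # intercept, Treatment, Post, Treatment×Post
y = [revenue for _, _, revenue in obs]

b0, b1, b2, b3 = ols(X, y)
print(f"beta_3 (treatment effect): {b3:.2f}")
```

With these numbers β₃ is 2 (that is, $2K), the same as (112 − 100) − (110 − 100) from the group averages. The fixed effects in the full model add precision; they do not change what β₃ measures.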

What ProofPod automates

Without ProofPod, running this analysis requires a data scientist who can:

  1. Select balanced test and control groups
  2. Build and validate a synthetic control
  3. Run the regression with proper fixed effects and clustered standard errors
  4. Interpret the Bayesian posterior for decision-making
  5. Monitor guardrail metrics for side effects

ProofPod automates all of these steps: you define the test, and ProofPod runs the statistics.
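To give a feel for step 1 in the list above, here is one common way to build balanced groups by hand: matched-pair randomization. Locations are ranked by pre-period revenue, neighbors are paired, and one member of each pair is randomly assigned to test. This is an assumption chosen for illustration, not a description of ProofPod's selection logic; the function name, the location names, and the revenue figures are all hypothetical.

```python
import random


def matched_pair_assignment(baseline, seed=0):
    """Split locations into test and control groups using matched pairs.

    baseline: dict mapping location name -> pre-period revenue.
    Locations are sorted by baseline revenue and paired with their neighbor;
    one member of each pair is randomly assigned to test, the other to control.
    With an odd number of locations, the largest one is left unassigned.
    """
    rng = random.Random(seed)  # fixed seed so the assignment is reproducible
    ranked = sorted(baseline, key=baseline.get)
    test, control = [], []
    for i in range(0, len(ranked) - 1, 2):
        pair = [ranked[i], ranked[i + 1]]
        rng.shuffle(pair)
        test.append(pair[0])
        control.append(pair[1])
    return test, control


# Hypothetical pre-period monthly revenue per location
baseline = {
    "Austin": 82_000, "Boise": 41_000, "Denver": 79_000,
    "Fresno": 43_000, "Omaha": 58_000, "Tulsa": 60_000,
}
test, control = matched_pair_assignment(baseline)
print("test:", test)
print("control:", control)
```

Because each pair contributes one location to each group, the two groups start with similar revenue, which makes the comparison between their changes more credible.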