Frequently Asked Questions
How many locations do I need to run a test?
At minimum, you need one treatment location and one control location. In practice, more locations give you tighter confidence intervals and faster results. ProofPod works best with 10+ locations, but meaningful tests are possible with as few as 5–6.
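To see why more locations tighten confidence intervals, note that the standard error of an average lift estimate shrinks roughly with the square root of the number of locations. This is a generic statistical sketch, not ProofPod's actual power calculation; the per-location spread value is made up for illustration:

```python
import math

def lift_std_error(per_location_sd, n_locations):
    """Rough standard error of an average lift across locations.

    Assumes independent locations with equal variance; real designs
    (matching, synthetic controls) change the constants but not the
    1/sqrt(n) shape.
    """
    return per_location_sd / math.sqrt(n_locations)

# Going from 5 to 20 locations halves the interval width.
print(lift_std_error(10, 5))   # ≈ 4.47
print(lift_std_error(10, 20))  # ≈ 2.24
```

The 1/sqrt(n) shape is why the gain from 5 to 10 locations is much larger than the gain from 20 to 25.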
How long should I run a test?
It depends on your data volume and the effect size you're trying to detect. ProofPod recommends a minimum of 2–3 weeks for most tests. Larger effects can be detected faster; smaller effects need more time. The good news: ProofPod's Bayesian approach lets you stop early if the evidence is clear before the planned end date.
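The early-stopping idea can be sketched with a simple normal approximation: compute the probability that the true lift is positive and stop once that probability is extreme in either direction. This is an illustrative sketch with a flat prior and a hypothetical 95% threshold, not ProofPod's actual Bayesian model:

```python
import math

def prob_lift_positive(lift_estimate, std_error):
    """Posterior probability that the true lift is positive, under a
    normal approximation with a flat prior (illustrative only)."""
    z = lift_estimate / std_error
    # Standard normal CDF via the error function.
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def early_stop_decision(lift_estimate, std_error, threshold=0.95):
    """Recommend stopping early when the evidence is clear either way."""
    p = prob_lift_positive(lift_estimate, std_error)
    if p >= threshold:
        return "Scale"      # clearly winning
    if p <= 1 - threshold:
        return "Kill"       # clearly losing
    return "Keep running"

print(early_stop_decision(3.0, 1.0))   # strong positive lift -> "Scale"
print(early_stop_decision(-3.0, 1.0))  # strong negative lift -> "Kill"
print(early_stop_decision(0.5, 1.0))   # ambiguous -> "Keep running"
```

Larger effects push the probability to an extreme sooner, which is why they can be called faster.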
Can I run multiple tests at the same time?
Yes. Concurrent tests on different locations are fully supported. ProofPod's overlap detection warns you if you try to assign the same location to multiple active tests.
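Overlap detection amounts to intersecting the proposed assignment with each active test's location set. A minimal sketch (the data shapes and test names are hypothetical, not ProofPod's API):

```python
def overlapping_locations(active_tests, new_assignment):
    """Return locations in a proposed assignment that already belong to
    an active test, grouped by the conflicting test's name.

    active_tests: dict mapping test name -> set of location IDs.
    new_assignment: set of location IDs for the new test.
    """
    conflicts = {}
    for test_name, locations in active_tests.items():
        shared = locations & new_assignment
        if shared:
            conflicts[test_name] = shared
    return conflicts

active = {"Spring promo": {"store-1", "store-2"}, "New layout": {"store-5"}}
print(overlapping_locations(active, {"store-2", "store-9"}))
# → {"Spring promo": {"store-2"}}
```

An empty result means the new test can run concurrently with everything already in flight.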
Can I stop a test early?
Yes. ProofPod supports early stopping in both directions—if your experiment is clearly winning or clearly losing, you'll see a Scale or Kill recommendation before the planned end date. You can complete the test at any time from the test detail page.
What file format does the CSV upload accept?
Standard CSV files up to 50 MB. ProofPod auto-detects column mappings and shows confidence scores for each mapping. You'll need columns for: date, location identifier, and at least one metric value. See Connect Your Data.
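A quick way to pre-check a file before upload is to confirm the three required column types are present. The column names below are illustrative (ProofPod auto-detects mappings, so your headers can differ):

```python
import csv
import io

# Hypothetical canonical names; your file's headers may be anything
# ProofPod can map to these roles.
REQUIRED = {"date", "location", "metric"}

def missing_columns(csv_text):
    """Return required column roles missing from the CSV header."""
    reader = csv.DictReader(io.StringIO(csv_text))
    headers = set(reader.fieldnames or [])
    return sorted(REQUIRED - headers)

sample = "date,location,metric\n2024-01-01,store-1,120\n"
print(missing_columns(sample))  # → []
print(missing_columns("date,location\n2024-01-01,store-1\n"))  # → ['metric']
```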
Can ProofPod test at the customer level?
Not currently. ProofPod is designed for location-level testing—comparing stores, studios, or branches to each other. Individual customer randomization is a different problem that requires different methods. See How Location Testing Works for why.
Which integrations are supported?
ProofPod currently supports Mindbody (OAuth), Square (OAuth), and ClubReady (API key). You can also upload CSV files or connect warehouse data via synced events. See Integrations.
How is this different from Google Optimize or other A/B testing tools?
Traditional A/B testing tools randomize at the individual visitor level—great for websites, not possible for physical locations. ProofPod uses Difference-in-Differences and synthetic control methods specifically designed for location-level experiments where true randomization isn't possible.
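The core of Difference-in-Differences is simple: subtract the control group's change over the test period from the treatment group's change, so that shared trends (seasonality, market-wide shifts) cancel out. A minimal sketch with made-up numbers:

```python
def diff_in_diff(treat_pre, treat_post, control_pre, control_post):
    """Difference-in-differences lift estimate: the treatment group's
    change minus the control group's change over the same period."""
    return (treat_post - treat_pre) - (control_post - control_pre)

# Treatment rose from 100 to 130; control rose from 100 to 110.
# The shared +10 trend cancels, leaving a +20 treatment effect.
print(diff_in_diff(100, 130, 100, 110))  # → 20
```

This is why a well-matched control matters: the subtraction only removes trends the two groups actually share.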
What does "Ready to Call" mean on the dashboard?
A test is "Ready to Call" when ProofPod has enough data to make a confident recommendation—either Scale or Kill. It means you have actionable results waiting for your review.
Can I change the primary metric after creating a test?
No. The primary metric is locked when the test is created because it drives the entire experimental design—matching, power calculations, and analysis. If you need to measure a different metric, create a new test. You can add or remove guardrail metrics at any time.
What does a low R² score mean?
R² measures how well the synthetic control matches the test group historically. A low R² (below 0.7) means the control isn't tracking the test group well, which reduces confidence in results. This can happen when you have too few locations, locations with very different behaviors, or insufficient historical data. See Location Matching.
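R² here is the usual coefficient of determination, computed over the pre-test period: 1 minus the ratio of the synthetic control's squared errors to the test group's total variance. A small sketch with made-up history:

```python
def r_squared(actual, predicted):
    """Coefficient of determination: how well `predicted` (the synthetic
    control) tracks `actual` (the test group's history)."""
    mean = sum(actual) / len(actual)
    ss_tot = sum((a - mean) ** 2 for a in actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return 1 - ss_res / ss_tot

test_group = [100, 110, 105, 120]   # hypothetical weekly values
synthetic  = [ 98, 112, 104, 121]   # close tracking -> high R²
print(round(r_squared(test_group, synthetic), 3))  # → 0.954
```

A value like 0.95 means the control explains almost all of the test group's historical variation; below 0.7, too much variation is unexplained to trust the comparison.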