Confidence Level
Contributor
What is Confidence Level?
Confidence level is the probability that the result of your A/B test reflects a real difference, not just random variation. It’s expressed as a percentage. Most often 90%, 95%, or 99%.
A 95% confidence level means that if you repeated the same test 100 times, about 95 of those tests would capture the true effect of the change. It doesn’t mean there’s a 95% chance you made the right decision. It means the method you used is expected to give a reliable result 95 times out of 100.
Why Confidence Level Matters in A/B Testing
Confidence level helps you understand whether your test results are trustworthy. The higher the confidence level, the lower the chance that a result is due to random noise.
Most teams use 95% as a default. It strikes a balance between speed and reliability. Choosing a 99% confidence level requires more users and a longer test. It reduces false positives (also known as Type I error) but delays decisions.
Going lower, like 90%, may be acceptable for exploratory tests or low-risk changes.
Confidence Level vs Confidence Interval
| Aspect | Confidence Level |
Confidence Interval |
|||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| What it means | Reliability of your test method | Range of plausible values for your result | |||||||||||||||
| Expressed as | Percentage (e.g., 90%, 95%, 99%) | A numeric range (e.g., +4% ±2%) | |||||||||||||||
| Applies to | The statistical method used | A specific estimate in your test result | |||||||||||||||
| Tells you | How often your test process would yield valid results | Where the true effect might lie, based on your data | |||||||||||||||
| Relationship | Higher confidence level → wider interval | Wider intervals indicate more uncertainty |
What a 95% Confidence Level Actually Means
A 95% confidence level means this:
If you ran the same test 100 times under the same conditions, the testing method would correctly identify a real effect about 95 times.
That’s it.
It does not mean:
- There’s a 95% chance your current result is correct
- There’s a 95% chance variant B is better than A
- You’re 95% confident in this specific outcome
“The higher the confidence level, the more confident you can be that your results are trustworthy. In general, experimenters should aim to achieve a confidence level of 95% or higher.
However, it’s essential to note [that] the highest confidence level you can get is 99.99%+. You can never be absolutely 100% sure your data is entirely accurate.
A 95% confidence level means the difference between versions is real and not just statistical noise or due to random chance.
However, it’s important to realize [that] this metric does NOT mean there’s a 95% chance of making the right decision based on the test results. It only tells you that you can be 95% sure the results reported are reliable.”
Deborah O’Malley, M.Sc., Founder of GuessTheTest
How to Set the Right Confidence Level for Your Tests
Choose your confidence level based on risk, not rules.
- 90% may be fine for quick or low-stakes experiments
- 95% is the standard default
- 99% is better for irreversible or high-risk decisions
The higher the confidence level:
- The harder it is to find a winner
- The more traffic you need
- The longer your test will take
If your confidence level remains low even after the full test duration, you may need to run the test longer or reconsider your sample size.