How to Reduce Sample Size Pollution for Accurate A/B Test Results

By Nyaima Smith-Taylor
August 7, 2020

You spend hours strategizing your test.

Your team creates a hypothesis.

You run the test and await the results. 

But you find your test failed. The results have been tainted. But how?

Don’t beat yourself up. There is a dirty little secret in the testing world called sample size pollution.

Pollution of your sample audience can unknowingly cause tests to be doomed before they even start. 

There is a long list of potential reasons tests fail, but one of the most frustrating is sample size pollution. 

This article will help you understand:

  • Why sample size pollution occurs.
  • How to know if your test is polluted. 
  • Steps to take to minimize sample size pollution.

Let’s have a look… 

Sample Size 101 

Definition of Sample Size

To work out the sample size you need, you can use tools like Convert’s A/B test duration calculator, which includes a sample size calculator, or CXL’s sample size calculator.

Most online calculators are simple to use. With Convert’s calculator, you only need to plug in three values: 

  • Existing Conversion Rate
  • Expected Improvement
  • Confidence Level

Example:

If the existing conversion rate is 3% and the expected improvement is 20% while testing two variations at a confidence level of 95%, you would need a sample size of 42,034 to get confident results. At 2,000 daily visitors to this test group, it would take 22 days according to our duration calculator.
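
The arithmetic behind this example can be sketched in Python. This is only a rough sketch, assuming a standard two-sided, two-proportion z-test with 80% power. The article does not state which power or tail settings Convert’s calculator uses, so the sample size computed here should not be expected to match 42,034. The function names are hypothetical. The duration step simply divides the required sample by daily traffic and rounds up.

```python
import math
from statistics import NormalDist


def sample_size_per_variation(baseline_rate, relative_lift, confidence=0.95, power=0.80):
    """Approximate visitors needed per variation (two-sided two-proportion z-test).

    The 80% power default is an assumption; the article does not state one.
    """
    p1 = baseline_rate
    p2 = baseline_rate * (1 + relative_lift)
    p_bar = (p1 + p2) / 2
    z_alpha = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    z_beta = NormalDist().inv_cdf(power)
    numerator = (
        z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2
    return math.ceil(numerator / (p2 - p1) ** 2)


def duration_days(total_sample_size, daily_visitors):
    """Days needed to collect the sample, rounded up to whole days."""
    return math.ceil(total_sample_size / daily_visitors)


n_per_variation = sample_size_per_variation(0.03, 0.20)
days = duration_days(42_034, 2_000)  # sample size and traffic from the example above
print(n_per_variation, days)
```

With the figures from the example, the duration step gives 22 days, because 42,034 / 2,000 is just over 21 and a partial day still counts as a full one.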

Determine Who Will Be In Your Sample

The easiest way to answer this question of “WHO?” or the segment, is by reviewing the demographics and sources of your current website visitors. Tap into the existing data for clues. Who are they? Where are they coming from?

Tools like Convert Experiences allow you to test using a specific segment of your website visitors and to create custom audiences.

Several factors can help you uncover the ‘who’:

  • Type of Traffic
    Do you get seasonal traffic? Do you expect an influx of visitors based on approaching holidays? Do your traffic numbers fluctuate depending on the day of the week?
  • Traffic Source
    Where does your traffic come from? People behave differently based on the source they enter your site from. For example, a visitor from LinkedIn may not interact with your site the same as someone coming from Facebook. 

    Examine Google Analytics to get an overview of visitor engagement based on Source. 
  • New vs. Old
    Statistics show that returning visitors remain on your site longer than new visitors. Think about how this will affect your test. 

The goal of this consideration stage is to help you build representative samples.

The Encyclopedia of Survey Research Methods defines representative samples as:

A representative sample is one that has strong external validity in relationship to the target population the sample is meant to represent. As such, the findings from the survey can be generalized with confidence to the population of interest.

To make sure you have a representative sample, Convert suggests running a test for at least one business cycle. This ensures your test has time to account for visitor variance that may happen within a cycle. 

What is Sample Size Pollution?

Now that you understand what sample size is, you can explore the factors that can corrupt your sample and screw up your test. This is how your sample affects validity. Factors that distort your sample and negatively affect test results are known as sample size pollution.

Invespcro defines sample pollution as:

“…factors that invalidate your A/B test data by influencing the samples or data used while conducting your test.”

This problem is more common than you might expect.

Biased Sample

In most cases, you want a random sampling, which means each visitor of your website has the same chance of seeing a particular variation before they are bucketed. Once placed in a bucket the user will see the same variant for the duration of the test. 

However, if you use an A/B testing tool that doesn’t perform randomization well, the randomization is not guaranteed and it can invalidate the test. 

A simple way to combat biased sampling is to use a good A/B testing tool like Convert that performs randomization and bucketing correctly. Start your testing off with an A/A test to check if the randomization works properly. 
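
To make the idea concrete, here is a minimal sketch of deterministic bucketing plus an A/A-style check on the traffic split. It is not Convert’s implementation. The hashing scheme, the function names, and the visitor IDs are all assumptions made for illustration. The check is a chi-square goodness-of-fit test with one degree of freedom against an expected 50/50 split.

```python
import hashlib
import math


def assign_variant(visitor_id, experiment_id, variants=("A", "B")):
    """Deterministically bucket a visitor: the same inputs always give the same variant."""
    digest = hashlib.sha256(f"{experiment_id}:{visitor_id}".encode()).hexdigest()
    return variants[int(digest, 16) % len(variants)]


def split_p_value(count_a, count_b):
    """Chi-square p-value (1 degree of freedom) for an expected 50/50 split."""
    expected = (count_a + count_b) / 2
    chi2 = (count_a - expected) ** 2 / expected + (count_b - expected) ** 2 / expected
    # For 1 degree of freedom, the chi-square survival function reduces to erfc.
    return math.erfc(math.sqrt(chi2 / 2))


# Simulate an A/A test with hypothetical visitor IDs.
counts = {"A": 0, "B": 0}
for i in range(10_000):
    counts[assign_variant(f"visitor-{i}", "aa-test")] += 1

p_value = split_p_value(counts["A"], counts["B"])
print(counts, round(p_value, 3))
```

A very small p-value here would suggest the split is further from 50/50 than chance alone explains, which is a reason to investigate the randomization before trusting any real test.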

You want to be aware of the potential of sample bias when you are considering the details of your test.        

Sources That Cause Sample Size Pollution

The four common types of sample pollution are timing, device, browser, and cookie.

Let’s look at each of them…

Timing

The length of your test influences the validity of your results. So it is no surprise “how long should I run my A/B test” is a common question.

CRO professionals have conflicting ideas on what’s an acceptable benchmark. Actually, your test variables should drive the proper length of your test.  

A straightforward solution may appear to be just allowing your test to run and run and run. But this too can cause issues. Added time means an increase in potential pollution from external factors.

You want to find the sweet spot. 

Another common mistake regarding the length of testing is stopping a test too early. This may not lead to sample size pollution, but it can negatively affect your test.  

The same is true if you stop the test as soon as you reach statistical significance. For a valid test, it should also reach the sample size you calculated for your desired MDE (Minimum Detectable Effect).

Along similar lines, never stop a single variant of a running test. This will cause catastrophic pollution: you would be unable to compare the stopped variant against a control that ran the whole time, so there would be no “apples to apples” comparison. Likewise, never stop and later restart a variant in a test.

Don’t interrupt your tests until you have reached your calculated sample size and the data is consistent.
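
The stopping rules above can be written down as a simple guard. This is a hypothetical sketch, not a feature of any particular tool: the class, the field names, and the numbers are invented for illustration. It treats a test as ready to stop only when the calculated sample size is reached, at least one business cycle has passed, and the result is significant.

```python
from dataclasses import dataclass


@dataclass
class ExperimentStatus:
    visitors_per_variation: int   # smallest visitor count across variations
    required_per_variation: int   # from your sample size calculation
    days_running: int
    business_cycle_days: int      # e.g. 7 for a weekly cycle (an assumption)
    is_significant: bool          # as reported by your testing tool


def ready_to_stop(status):
    """Return (ok, reasons); ok is True only when every stopping condition holds."""
    reasons = []
    if status.visitors_per_variation < status.required_per_variation:
        reasons.append("calculated sample size not reached")
    if status.days_running < status.business_cycle_days:
        reasons.append("less than one full business cycle")
    if not status.is_significant:
        reasons.append("no statistical significance yet")
    return (not reasons, reasons)


# Significant after five days, but well short of the planned sample.
early = ExperimentStatus(9_000, 20_000, 5, 7, True)
print(ready_to_stop(early))
```

In this example the test already shows significance, yet the guard still says no, which is exactly the situation where stopping early is most tempting.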

Cookie Pollution

Cookies may cause the most insidious form of sample size pollution. 

Techopedia defines a cookie as follows:

A cookie is a text file that a Web browser stores on a user’s machine. Cookies are a way for Web applications to maintain application state. They are used by websites for authentication, storing website information/preferences, other browsing information and anything else that can help the Web browser while accessing Web servers. HTTP cookies are known by many different names, including browser cookies, Web cookies or HTTP cookies.

For marketers, cookies make it possible to track visitors’ behavior on your site.

The lifespan of cookies is volatile. Visitors can delete them on the slightest whim.

The longer your test runs, the more vulnerable you are to cookies being deleted – again leading to another form of sample size pollution. To mitigate this phenomenon, Convert advises customers to run tests for no more than 90 days.  

Device Pollution

Visitors visit your site from multiple devices: mobile, laptops, tablets, desktops, and even smartwatches. 

Just think of your own browsing behavior. You may spot something on your mobile device while at the gym. Later in the day, you may revisit the website on your desktop computer.

If this happens in the confines of your A/B test, it may appear that two different people visited your site when in fact it is the same person browsing from two different devices. 

Even more dangerous to your testing efforts, this same person may see a different variant on each device.

There is an inverse example of this. What happens when two people use the same device to visit your website? 

Imagine two brothers live in the same house. They share a desktop computer. Both are preparing for vacation and need to order new t-shirts and footwear. If an A/B test is running on the e-commerce site at the time of their visit, the data would show these two people as a single user, again, corrupting your sample size.   
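
Both scenarios can be illustrated with a tiny, made-up visit log. The “person” column is ground truth that a cookie-based testing tool normally cannot see. It is included here only to show what the tool gets wrong. All names and cookie IDs are hypothetical.

```python
# Hypothetical visit log: (actual person, device cookie ID, variant shown).
visits = [
    ("alice", "cookie-phone-1", "A"),
    ("alice", "cookie-desktop-7", "B"),  # same person, second device, other variant
    ("ben", "cookie-shared-pc", "A"),
    ("carl", "cookie-shared-pc", "A"),   # two brothers sharing one desktop
]

people = {person for person, _, _ in visits}
cookies = {cookie for _, cookie, _ in visits}

# Variants seen per person; more than one means the person saw mixed experiences.
variants_by_person = {}
for person, _, variant in visits:
    variants_by_person.setdefault(person, set()).add(variant)
contaminated = sorted(p for p, seen in variants_by_person.items() if len(seen) > 1)

# People per cookie; more than one means a shared device counted as one visitor.
people_by_cookie = {}
for person, cookie, _ in visits:
    people_by_cookie.setdefault(cookie, set()).add(person)
shared = sorted(c for c, users in people_by_cookie.items() if len(users) > 1)

print(len(people), len(cookies), contaminated, shared)
```

Note that the totals happen to agree here (three people, three cookies), even though one person is split across two “visitors” and two people are merged into one. Matching headline counts do not prove the sample is clean.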

Browser Pollution

When the average person gets online, they do not consider how visiting the same website in different browsers will affect an A/B test. But moving from one browser to another on the same website, such as from Safari to Chrome, can lead to sample size pollution similar to what occurs with multiple devices.

However, this specific form of pollution is rare, as most people will stick to using one preferred browser per device.

New Dangers

Browsers, device type, cookies, and length of tests are the most common sample size pollutants, but a new pollutant is entering the conversation: industry professionals are complaining about bots creating sample size pollution.

Thankfully at Convert, we have strong bot mitigation measures embedded within our tool so that will not be an issue.

Tips on How to Reduce Sample Size Pollution

Because Sample Size Pollution is a major issue, many companies have come up with creative fixes, like putting users into different buckets based on location. 

But such tactics can strip tests of “user randomness,” and can reduce your confidence that the test results are valid. 

Below are a few things you can do to reduce the chances of sample pollution:

  • Run tests for separate devices.
  • Run tests for separate browsers.
  • Identify patterns. How has your data looked in the past? It should look similar during testing. This is data consistency.
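
If you cannot run fully separate tests, breaking results down by device is a related way to spot device-specific patterns. The sketch below groups made-up visitor records by device and variant. The data and field layout are assumptions, and the sample is far too small to draw conclusions from; it only shows the shape of the breakdown.

```python
from collections import defaultdict

# Hypothetical per-visitor records: (device, variant, converted).
records = [
    ("mobile", "A", False), ("mobile", "A", True),
    ("mobile", "B", False), ("mobile", "B", False),
    ("desktop", "A", True), ("desktop", "A", False),
    ("desktop", "B", True), ("desktop", "B", True),
]

# (device, variant) -> [visitors, conversions]
totals = defaultdict(lambda: [0, 0])
for device, variant, converted in records:
    totals[(device, variant)][0] += 1
    totals[(device, variant)][1] += int(converted)

rates = {key: conversions / visitors for key, (visitors, conversions) in totals.items()}
for (device, variant), rate in sorted(rates.items()):
    print(f"{device:8} {variant}: {rate:.0%}")
```

The same grouping works for browsers by swapping the device field for a browser field.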

Here are a few more things to consider…

Understand Variance

Variance and standard deviation go hand-in-hand with consistency. Essentially, they will tell you how far away from the average your numbers are. Low variance means your data is consistent with the average, which puts you at a lower risk of pollution.

You can do the math by hand or use a simple standard deviation calculator.
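
If you would rather script it, Python’s standard library covers the calculation. The daily conversion rates below are invented for illustration, and the sketch assumes the population versions of variance and standard deviation. Use `variance` and `stdev` instead if you treat the days as a sample.

```python
from statistics import mean, pstdev, pvariance

# Hypothetical daily conversion rates (%), one week each.
steady = [3.0, 3.1, 2.9, 3.0, 3.2, 2.8, 3.0]
erratic = [3.0, 1.2, 5.1, 2.0, 4.6, 0.9, 4.2]

for label, rates in (("steady", steady), ("erratic", erratic)):
    print(label, round(mean(rates), 2), round(pvariance(rates), 3), round(pstdev(rates), 3))
```

Both weeks average the same conversion rate, but the second series swings much further from that average, which is the kind of inconsistency worth investigating before and during a test.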

Be Aware of Potential Sampling Issues

There are inherent problems with A/B testing, including the possibility of sample size pollution. 

Knowledge of potential sample size issues empowers you to make better choices as you pick your test goals, create treatments, and run experiments.

Now You Can Beat Sample Pollution  

Good testing practices mean you start your projects with a full understanding of what can go wrong. 

Sample size pollution is a negative by-product that’s experienced when you run A/B tests. Your job is to reduce these negative effects as much as you can so you can have a successful test. 

Remember, mitigation happens before your test begins.

Use a robust tool like Convert that gives you the ability to segment tests, combat pesky bots, and use good randomization techniques, all inside a simple platform that supports complex testing.

Your experimentation strategy and the power of your software will make the difference in how well you minimize sample size pollution.

Now that you know about this potential blind spot in your testing, it can’t creep up on you.

Originally published August 07, 2020 - Updated November 06, 2024
Written By
Nyaima Smith-Taylor
Content Strategist & Creator at Convert
Edited By
Carmen Apostu
Head of Content at Convert
