Continuous A/B testing and tweaking of sites, emails and campaigns is becoming daily business for every marketer. Still, it is far too easy to just say: let’s split! Will the results matter then? Can we confidently convince our management of the findings we made? I have collected a few steps to take when preparing an A/B test.
A/B tests are generally tests in which you split the examined population into 2 roughly equal parts. More complex test designs exist, but that is a topic for another post.
Before diving into the actual steps, I want to point out a statistical principle for evaluation. It is far too easy to say that whichever test group performs better on the key measure is definitely better. That is not the case. You need to make sure that the difference is statistically significant.
Statistical inference is a broad topic, but in a nutshell: if the results of the test groups are not different enough, you cannot be sure whether the better performance is due to luck or truly due to the change you have made. Understanding this concept really determines the way you design an A/B test.
Just look at the 2 options below. In option A, can you confidently say which distribution the point indicated by the red arrow belongs to? What about option B? I’m certain that you want your test results to resemble option B: you want a high level of confidence that if you apply the change you have tested, you will get better results in the future as well.
|Option A on difference of distributions: wherever you put the arrow, the selected point has an almost equal probability of belonging to either distribution. While the probabilities do differ, it is difficult to say with confidence which distribution the point belongs to.|
|Option B on difference of distributions: wherever you put the arrow, you can make a confident decision about which distribution the selected point belongs to.|
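To make the difference between the two options more concrete, here is a minimal sketch of a two-proportion z-test, one common way to check whether two conversion rates differ significantly. The function name and all the numbers are made up for illustration, and the normal approximation it relies on assumes reasonably large groups.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_p_value(conv_a, n_a, conv_b, n_b):
    """Two-sided p-value for the difference between two conversion rates."""
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    # Pooled rate under the assumption that both groups behave the same.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_b - rate_a) / std_err
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Hypothetical data: the same rates (5% vs 6%) with different group sizes.
small = two_proportion_p_value(50, 1000, 60, 1000)
large = two_proportion_p_value(1000, 20000, 1200, 20000)
print(f"1,000 per group:  p = {small:.3f}")
print(f"20,000 per group: p = {large:.6f}")
```

With 1,000 people per group the p-value stays well above the usual 0.05 threshold, which is the option A situation. With 20,000 per group the same rates give a p-value far below it, which is what option B looks like.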
So, to make sure your results look like option B, you need to do the following in the case of a simple 50/50 A/B test.
1. Decide the one and only difference to test
Why only one? The reason is simple: if you make many changes at once, you will not know which one caused the increased performance. And you do want to know that, because later you expect to use it as business as usual to constantly drive performance. So limit yourself and make a conscious choice, so that you have a clear, unbiased test.
2. Decide on the key revenue, lead, etc. metric as the criterion of success
From a statistical point of view this step is not a necessity, but I think you should apply it as well. Measuring clicks and impressions is easy, but while they are part of your lead/revenue generation funnel, you still want to make sure that the changes you make touch your bottom line. It is best to plan tests to impact at least one business-critical KPI and only fall back on other metrics when you cannot prove significance.
3. Decide the size of the population you want to test on
This decision is important from 2 aspects:
- Doing a test always involves the risk of actually driving lower than usual performance. Decide on the “level of risk” you are willing to take by consciously sizing the test population.
- On the other hand, from a statistical perspective the bigger the better, because larger groups make it less likely that a difference is just luck. In general you need to pick a big enough segment to generate sufficient data for analysis. This can be challenging when you have a small database or limited traffic to your site, but try to keep this in mind.
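As a rough guide to what "big enough" means, the sketch below uses the standard normal-approximation formula for comparing two proportions to estimate the group size needed. The 5% significance level, the 80% power, the function name and the example rates are all assumptions chosen for illustration, not values from a real campaign.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_group(baseline, expected, alpha=0.05, power=0.80):
    """Approximate people needed in EACH group to detect baseline -> expected."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance
    z_power = NormalDist().inv_cdf(power)
    # Sum of the binomial variances of the two groups.
    variance = baseline * (1 - baseline) + expected * (1 - expected)
    n = (z_alpha + z_power) ** 2 * variance / (expected - baseline) ** 2
    return ceil(n)


# Hypothetical targets: lifting a 5% conversion rate to 6% or to 5.5%.
print(sample_size_per_group(0.05, 0.06))
print(sample_size_per_group(0.05, 0.055))
```

Detecting a lift from 5% to 6% takes on the order of 8,000 people per group under these assumptions, and halving the expected lift raises the requirement considerably. That is the trade-off against the risk you are willing to take.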
4. Make sure you can do a random split
After splitting your test population into 2, you want to be sure that the groups are very similar (homogeneous) to each other. If they are not, anyone can say that your results are only better because group A differs from group B in certain respects. This is the same principle as in Step 1: you want to be sure that your results are influenced only by the change you are testing.
Most marketing software is capable of such a random split, so make sure you use these tools.
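If your tool does not offer this, a random 50/50 split is easy to sketch by hand. The example below is only an illustration: the function name, the fixed seed and the numeric contact IDs are assumptions, and your own list will look different.

```python
import random


def random_split(contact_ids, seed=42):
    """Shuffle the contacts and cut the list in half."""
    rng = random.Random(seed)  # fixed seed so the split can be reproduced
    shuffled = list(contact_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


# Hypothetical list of 10,000 contact IDs.
group_a, group_b = random_split(range(1, 10001))
print(len(group_a), len(group_b))
```

Because the assignment depends only on the shuffle, and not on sign-up date, region or any other attribute, the two groups should be similar on average. It is still worth comparing a few known attributes of both groups before the test starts.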
5. Create your hypothesis – and test if it is going to be significant
You need a target for how much better performance you expect from the change you are applying. Once you have this, you have everything you need to check whether, if your hypothesis is right, you will have real results to analyze.
Crosscheck this hypothesis with an A/B test calculator. The linked tool is very simple, straightforward and reliable, so I encourage you to use it and tweak your test until you have validated your hypothesis.
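To give a feel for the kind of check such a calculator performs, here is a minimal sketch that estimates the chance of getting a significant result if your hypothesis is true. It is not the calculator mentioned above: the function name, the 5% significance level and the example rates are assumptions, and it uses a simple normal approximation.

```python
from math import sqrt
from statistics import NormalDist


def approximate_power(baseline, expected, n_per_group, alpha=0.05):
    """Chance of a significant result if the expected rate is the true one."""
    std_err = sqrt(
        baseline * (1 - baseline) / n_per_group
        + expected * (1 - expected) / n_per_group
    )
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_effect = abs(expected - baseline) / std_err
    return NormalDist().cdf(z_effect - z_alpha)


# Hypothesis: the change lifts conversion from 5% to 6%.
for n in (2000, 10000):
    print(n, round(approximate_power(0.05, 0.06, n), 2))
```

With 2,000 people per group the chance of proving this hypothetical lift is well below 50%, so the test would need a bigger population or a bolder change. With 10,000 per group it is above 80%.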
6. Design the contingency plan
It can happen that, even after these steps, your critical business KPI will not give a confident result. So make sure that during the test you also measure other key metrics that have a direct effect on the KPI, so that you can use them as an alternative to select a winner. It is best to use this only as a last resort, but it is still better than trashing your entire test.
7. Give enough time for results to come in
Don’t expect results ASAP, and do not close your test at the first sign of potential failure. You are already making an investment to learn how to improve your marketing, so make sure you give the data enough time to accumulate. Seeing unexpected behavior during a test is actually awesome: this is why you created it and this is how you discover things, so give things time to unfold.
After you have done all these steps, you will have results you can confidently use. I hope these points will lead you to many successful tests that will drive increased marketing performance in your business!