Best practices you should follow when creating your A/B tests:
Test only one component/variable at a time:
If you test a new call-to-action and a new value proposition at the same time, you won't be able to decouple their effects on your results. To be able to tell which variable is responsible for a given outcome, only test one variable at a time.
Split your leads into equal batches when testing a new variable:
When running A/B tests, never focus all of your effort on a single assumption. If you have a baseline approach, run your tests against that approach; if you don't have a baseline, test different assumptions simultaneously.
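For instance, here is a minimal Python sketch of an even random split between two variants (the lead list and helper name are purely illustrative, not part of any specific tool):

```python
import random

def split_into_batches(leads, n_batches=2, seed=42):
    """Shuffle the leads and deal them into equally sized batches, one per variant."""
    shuffled = leads[:]                      # copy so the original list stays untouched
    random.Random(seed).shuffle(shuffled)
    # Round-robin assignment keeps batch sizes within one lead of each other
    return [shuffled[i::n_batches] for i in range(n_batches)]

# Example: split 200 leads evenly between variants A and B
leads = [f"lead_{i}@example.com" for i in range(200)]
batch_a, batch_b = split_into_batches(leads)
print(len(batch_a), len(batch_b))            # 100 100
```

Randomizing before splitting matters: it prevents an ordered list (for example, sorted by company size) from biasing one variant.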
Have a significant sample size (add 200 contacts per template):
Use the Reports page to keep track of the volume of contacts added to each of your templates. For your results to be conclusive, make sure the sample size is large enough before making a decision.
A/B test to optimize your sequences:
Since poor-performing sequences hurt both deliverability and results, you should constantly A/B test different message types to find the optimal sequences and customer profiles to address going forward.
Only test 1 thing at a time - Never everything at once:
Test only one variable at a time, and always run separate tests for open rate and reply rate (they are heavily correlated, so testing both at once will mix up the causal effects).
Example of an A/B test
Imagine a scenario where you want to test two different calls to action:
- A, call-to-action 1: "Are you available for a coffee?"
- B, call-to-action 2: "Do you have 10 min for a quick call?"
The metric you want to measure is the reply rate, and your baseline is 35% (the average reply rate of your email campaigns). You define a minimum detectable effect of 30% and a statistical significance level of 95%, and the calculator tells you that the required sample size is 170. This means you have to send 170 emails for each call-to-action (A and B).
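If you want to reproduce that number yourself, here is a rough sketch of the same calculation in Python using statsmodels. It assumes 80% statistical power, a common default in online calculators that is not stated in the example above:

```python
from statsmodels.stats.power import NormalIndPower
from statsmodels.stats.proportion import proportion_effectsize

baseline = 0.35                     # current average reply rate
mde = 0.30                          # minimum detectable effect, relative to the baseline
target = baseline * (1 + mde)       # the reply rate you want to be able to detect (45.5%)

# Cohen's h effect size for comparing two proportions
effect_size = proportion_effectsize(target, baseline)

# Required emails per variant at 95% significance (alpha = 0.05), assuming 80% power
n_per_variant = NormalIndPower().solve_power(
    effect_size=effect_size, alpha=0.05, power=0.80, alternative="two-sided"
)
print(round(n_per_variant))         # ~170 emails per call-to-action
```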
After testing both calls-to-action, imagine that this was the result of your test:
As you can see, call-to-action 2 (scenario B) had a higher reply rate than call-to-action 1 (scenario A): scenario B performed 45% better than scenario A. Since the minimum detectable effect was set at 30% of the 35% baseline, scenario B is clearly better than scenario A and wins this A/B test.
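To check that a lift like this is statistically significant rather than noise, you can run a two-proportion z-test. The reply counts below are hypothetical, chosen only to mirror the roughly 45% relative lift described above:

```python
from statsmodels.stats.proportion import proportions_ztest

# Hypothetical reply counts for 170 emails per variant, chosen to mirror a ~45% lift
replies = [58, 84]                  # A: "coffee" CTA, B: "10 min call" CTA
emails_sent = [170, 170]

rate_a = replies[0] / emails_sent[0]
rate_b = replies[1] / emails_sent[1]
lift = (rate_b - rate_a) / rate_a
print(f"A: {rate_a:.1%}, B: {rate_b:.1%}, relative lift: {lift:.0%}")

# Two-proportion z-test: is the difference significant at the 95% level?
stat, p_value = proportions_ztest(count=replies, nobs=emails_sent)
verdict = "significant" if p_value < 0.05 else "not significant"
print(f"p-value: {p_value:.3f} -> {verdict} at the 95% confidence level")
```

If the p-value comes out above 0.05, treat the test as inconclusive and keep collecting data rather than declaring a winner.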