
How to run your first incrementality experiment

Actionable results build trust, inform strategies, and unlock growth potential.

Like an A/B test but for your media campaigns, incrementality experiments are an incredibly powerful tool for understanding the impact of your marketing strategy. However, there are plenty of ways to get them wrong and waste time and money on tests that never give you a clear insight.

Here are the five steps every team should follow to run powerful incrementality tests that deliver clear results.

1. Choose a clear hypothesis

The first thing to nail down is exactly what you're testing and what you expect to happen. This sounds obvious, but if an explicit hypothesis isn't written down beforehand, a team will inevitably start to move the goalposts once unexpected results come in.

We recommend a format along the lines of: “By doing X, we will cause Y, because Z”. For example: “By running Bus-back OOH, we will cause a significant uplift in traffic to the website, because we will be efficiently reaching our target market in a new way”. This makes it clear what metric you are holding yourself to, and also records what you think will happen in the market. This means that if the test does not drive a detectable uplift, at least you can be clear about which assumption you had about your market may be wrong.

2. Choose the correct test markets

Choosing the right combination of markets in which to run the test is both an art and a science.

Depending on which methodology you are using, there may be different characteristics of your different geographic markets that make them more or less accurate when running experiments. Identifying any markets that are not suitable for tests is a crucial step in any experiment planning.

In broad terms, you want to choose a market that is big enough and representative enough that any change in customer behaviour is meaningful for the rest of the country. In Australia this often excludes markets like the Northern Territory or Tasmania.

Similarly, there may be strategic considerations (e.g. shipping costs or competitor activity) which narrow down your options. Bimodal provides a simple report which shows the different possible tests in different combinations of markets to aid this step.

3. Plan for enough budget

Once you have a market, the next step is to set a budget that has a viable chance of delivering a real uplift (in that market). It is a common problem with A/B testing that advertisers may make changes that are too conservative to actually create a noticeable difference. Geo-holdout tests have the same issue.

Many marketers have not-so-fond memories of an A/B test to change a heading colour from green to blue which simply never reaches statistical significance and is left running for months on end. It is essential to understand the minimum level of uplift you are expecting and how sensitive your test will be when running experiments.

Exactly how small an uplift your test will be able to detect is subject to a wide range of factors. Bimodal provides a report that uses historical regional data to understand what this minimum detectable uplift is. As a rule of thumb, we would recommend targeting an uplift of around 10%, not less than 5%.
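As a rough illustration of how market noise and test length interact, the back-of-envelope calculation below estimates a minimum detectable uplift from historical weekly sales. This is a hypothetical sketch, not Bimodal's method: it assumes roughly normal, independent week-to-week noise and uses the textbook power approximation for 5% significance and 80% power.

```python
import math
import statistics

def minimum_detectable_uplift(weekly_sales, test_weeks):
    """Rough estimate of the smallest relative uplift a geo test
    of `test_weeks` weeks could detect, given historical noise."""
    mean = statistics.mean(weekly_sales)
    stdev = statistics.stdev(weekly_sales)
    cv = stdev / mean  # coefficient of variation of weekly sales
    # 2.8 ≈ z(0.975) + z(0.80), the usual factor for 5% significance / 80% power
    return 2.8 * cv / math.sqrt(test_weeks)

# Hypothetical example: a year of weekly sales with ~15% week-to-week swings
history = [100_000 * (1 + 0.15 * math.sin(w)) for w in range(52)]
mde = minimum_detectable_uplift(history, test_weeks=6)
print(f"Minimum detectable uplift over 6 weeks: {mde:.1%}")
```

Note how the detectable uplift shrinks with the square root of the test length: doubling the duration only improves sensitivity by about 40%, which is why noisy markets demand bolder tests rather than longer ones.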

So, when choosing a budget, you should ideally spend enough money that even the absolute lowest ROI you could accept from the campaign would still produce the minimum uplift you are targeting. For example, with a brand campaign, this might be a 1x ROI over the test period. For PPC it might be your target rolled-up ROI.
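The budget logic above can be written down as a one-line calculation. All the numbers here are hypothetical, purely to show the arithmetic:

```python
# Budget sizing: incremental revenue needed = baseline * target uplift;
# budget = incremental revenue / lowest acceptable ROI, so that even at
# that floor ROI the campaign still produces a detectable uplift.

def required_budget(baseline_revenue, target_uplift, floor_roi):
    incremental_revenue_needed = baseline_revenue * target_uplift
    return incremental_revenue_needed / floor_roi

# e.g. $500k baseline revenue in the test market over the test period,
# targeting a 10% uplift, with a 1x ROI floor (brand campaign):
budget = required_budget(500_000, 0.10, 1.0)
print(f"Minimum test budget: ${budget:,.0f}")  # $50,000
```

A more efficient channel (higher floor ROI) needs proportionally less budget to generate the same uplift, which is why the same test is cheaper to run on PPC than on brand media.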

4. Plan for enough (but not too much) campaign duration

Once you have a budget, you also need to understand how long the campaign should run. This is related to the minimum uplift (longer campaigns should detect smaller uplifts). However, you also need to consider both the ad-stock decay of your channels (the length of time that spend in channel influences the market) and the time-to-conversion of your customers. If you are running brand focused media and you know that it takes about 4 weeks from first interaction to purchase, you shouldn’t run a test for less than 4 weeks.
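One way to combine the two constraints above is a simple rule of thumb: run at least as long as your sensitivity calculation requires, and at least as long as the delayed response takes to land. This is a hypothetical heuristic, not a formal method:

```python
# Rule-of-thumb test duration: long enough for statistical sensitivity
# AND long enough to capture delayed response (ad-stock + conversion lag).

def minimum_test_weeks(weeks_for_sensitivity, adstock_weeks, conversion_weeks):
    return max(weeks_for_sensitivity, adstock_weeks + conversion_weeks)

# e.g. sensitivity says 4 weeks, but ad-stock (2w) plus time-to-conversion
# (4w) means the test should not be stopped before week 6:
print(minimum_test_weeks(4, 2, 4))  # 6
```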

Importantly, once you have a campaign duration that is long enough to capture the effect of your campaign, you have to hold yourself to that. It can be very tempting to simply “let the test run” when you don’t get the result you need. The issue with this is that teams will often simply choose to run a test forever instead of accepting a negative result.

5. Validate and interpret the campaign results

Once the campaign has run and you are able to calculate the uplift in the targeted region, you should consider two separate metrics. Firstly, the total uplift detected is obviously important, and we want it to be a big number. However, a p-value, or another measure of uncertainty in the result, is also crucial.

It is easy to celebrate prematurely when you see a large uplift from a test. However, if the test result is not statistically significant, you cannot be confident that your campaign was actually the cause of any observed increase or decrease.
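To make this concrete, here is a deliberately crude placebo-style check. It assumes you have the post-period percentage change versus forecast for the test market and for each untouched control market; the "p-value" is simply the share of control markets that moved at least as much as the test market. This is an illustrative rank test, not a substitute for a proper geo-experiment methodology, and all the numbers are hypothetical:

```python
# Placebo-style significance check: how often do untouched control
# markets show a change at least as large as the test market's?

def placebo_p_value(test_uplift, control_uplifts):
    at_least_as_big = sum(1 for u in control_uplifts if u >= test_uplift)
    # +1 smoothing counts the test market itself among the candidates
    return (at_least_as_big + 1) / (len(control_uplifts) + 1)

# Hypothetical results: test market grew 12% vs. forecast,
# while controls drifted between -4% and +5%.
controls = [-0.04, 0.01, -0.02, 0.03, 0.05, 0.00, -0.01, 0.02]
p = placebo_p_value(0.12, controls)
print(f"Placebo p-value: {p:.2f}")  # 0.11
```

Even with a 12% uplift that beats every control, eight control markets can only ever produce a p-value of about 0.11, which illustrates why both the size of the uplift and the precision of the design matter.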

This also means that post-experiment you should validate that other channels remained steady and that the media was run in the correct markets for the correct duration. If not, the test will be skewed and potentially unusable.

If you receive a low-confidence result from a valid experiment, this can still be extremely valuable. If you budgeted the campaign generously, then you know that that spend did not meet your minimum expectations. So even if we cannot quantify the exact uplift that the campaign generated, we can satisfy ourselves by saying that it was not “enough” and move on to the next idea.

Once you have a result (successful or not), you can use this information to increase or decrease your media spend in the target channels. Valid incrementality tests can also be used to calibrate and improve Marketing Mix Models.
