Experimentation: To Test or Not To Test?

A tale of two priorities

Priscilla Cheung
5 min read · Jan 25, 2021
Photo by Lachlan Donald on Unsplash

In recent years I’ve seen marketing and product teams hungrier than ever to test their initiatives through experimentation — an analyst’s dream, right?

You would expect any analyst’s nirvana to be their stakeholders’ increasing appetite for being data-driven. In reality, this often leads to a crossroads between two conflicting priorities: driving a genuine test and learn culture, versus being asked to test everything under the blazing sun.

When it comes to driving true experimentation culture across a business, the key is to discern what is actually worth testing. You need to move the needle, not find the needle. That is, experimenting on marketing and product developments that drive high impact, rather than arbitrarily finding a needle in the haystack of ideas to test.

A common example of this is found in the practice of conversion rate optimisation (CRO). Marketers will want to test whether varying images, or the colour of a call-to-action button, will increase the conversion rate. More often than not, these are “find the needle” tests: there is no statistically significant difference in conversion between the variations. What’s worse, these tests consume a great deal of resources, squandering valuable time and effort from your engineering and design teams, and even your analysts, for no added value.
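
To make the point concrete, here is a minimal sketch of the sort of check that sits behind a button-colour test, using Python and statsmodels with entirely hypothetical numbers:

```python
# Hypothetical button-colour test: is variant B's conversion rate
# significantly different from variant A's?
from statsmodels.stats.proportion import proportions_ztest

conversions = [230, 242]      # conversions observed for variants A and B
visitors = [10_000, 10_000]   # visitors exposed to each variant

z_stat, p_value = proportions_ztest(count=conversions, nobs=visitors)
print(f"z = {z_stat:.2f}, p = {p_value:.3f}")
# With 2.30% vs 2.42% conversion on this much traffic, p lands around 0.6,
# i.e. no statistically significant difference between the variations.
```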

To combat this, I’ve adapted two common analysis principles specifically for experimentation to optimise effectiveness in testing.

Analyses Before Hypotheses

The analysis of competing hypotheses (ACH) methodology is structured to reduce the bias inherent in even a seasoned analyst’s critical thinking, yet one of the main critiques of the method is how little rigour typically goes into generating the initial hypotheses. If we are to be exhaustive in generating hypotheses (thereby limiting bias), there needs to be exploratory analysis even before they are proposed.

Experiments should be founded on analyses, with recommendations for tests spearheaded by data-led hypotheses that move the needle to a notable difference.

When we asked what would improve the quality of our online leads, we initially came up with several hypotheses: marketing channels, landing pages and content were all anticipated to be drivers of lead quality.

Through exploratory analysis, I discovered that leads who gave their contact details immediately, before working through the multiple steps of a form, tended to be lower-quality leads than those who completed all the steps first. Leads like Bradley, who was hoping to start a business, happily provided his details at the beginning of the form to source further information; quality leads like Jane, who urgently needed a loan to expand her business, were willing to go through the whole form before submitting theirs.
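
For illustration, the exploratory cut amounted to something like the following sketch in pandas; the file, column names and quality flag here are hypothetical stand-ins rather than our actual schema:

```python
# Hypothetical exploratory cut: lead quality by where in the form
# contact details were submitted. Column names are illustrative only.
import pandas as pd

leads = pd.read_csv("leads.csv")  # one row per lead

summary = (
    leads.groupby("details_submitted_at")        # e.g. "start_of_form" / "end_of_form"
         .agg(lead_count=("lead_id", "count"),
              quality_rate=("is_quality_lead", "mean"))
)
print(summary)
```

A markedly higher quality rate for end-of-form submissions is exactly the kind of pattern that earns a hypothesis its place in the test backlog.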

This analysis yielded an additional hypothesis to those initially ideated: that where in a form customers submitted their details (at the beginning versus the end) was a driver of lead quality. The ensuing experiment resulted in a lower volume of leads when customers submitted their details at the end, but statistically significantly higher lead quality.

Since the generation of hypotheses was founded on analyses, we were able to extend the hypotheses beyond the limitations of our existing perspectives and collective experience. We were then able to test for, and subsequently impact, behaviour that moved the needle and drove notable growth for the business.

Analysis Before Paralysis

Experimentation also inevitably produces instances where there is no statistical difference. When that happens, you are at a crossroads: will there ever be, and should there be, a clear winner?

As an analyst, you are often required to decide what success looks like. Taking the stance of “moving the needle” means that you need to overcome analysis paralysis by being simultaneously data-driven and agile when determining the success of experiments.

We had been testing two alternate forms on our website for six weeks with no statistically significant difference between the variations. Our wait for a winning variation was beginning to impede the implementation of other experiments. Rather than let this paralyse our progress, I advocated for the experiment to be discontinued — and to keep the variation with the simpler flow.

This recommendation was by no means uninformed. In this case, no statistical difference meant there was not enough evidence that either form was worse than the other. With enough data, equivalence testing can confirm the null hypothesis beyond reasonable doubt: that there is no worthwhile difference between the variations. In the absence of that after six weeks, I drove the recommendation to go ahead without a winning variation by using a significance estimator (built as part of our experimentation tracker*).
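
For context, equivalence testing on two conversion rates can be framed as two one-sided tests (TOST). A minimal sketch using statsmodels, with hypothetical counts and an assumed margin, might look like this:

```python
# Hypothetical equivalence check (TOST) on two form variants' conversion rates,
# built from two one-sided z-tests. The margin defines the smallest difference
# we would actually care about.
from statsmodels.stats.proportion import proportions_ztest

conversions = [410, 398]     # form A, form B
visitors = [12_000, 12_000]
margin = 0.005               # +/- 0.5 percentage points = "no worthwhile difference"

# H0a: rate_A - rate_B <= -margin (A meaningfully worse than B)
_, p_lower = proportions_ztest(conversions, visitors, value=-margin, alternative="larger")
# H0b: rate_A - rate_B >= +margin (A meaningfully better than B)
_, p_upper = proportions_ztest(conversions, visitors, value=margin, alternative="smaller")

p_tost = max(p_lower, p_upper)
print(f"TOST p-value: {p_tost:.3f}")
# p_tost < 0.05 would let us claim the forms are practically equivalent;
# anything larger means we simply haven't collected enough data to say.
```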

Using an automated ingestion of the experiment data, the significance estimator gauges the additional duration required to reach a statistically significant difference between the variations. The estimate assumes that all other inputs to the experiment remain constant (in this case, the conversion rates observed so far and the sample sizes from website traffic). Given the sheer number of additional weeks estimated, we moved on from the experiment without a statistical difference.

Experiment Significance Estimator
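
Under the hood, this kind of estimate is essentially a power calculation on the gap observed so far. A hypothetical sketch of the idea (not the tracker’s actual implementation) could look like this:

```python
# Hypothetical sketch of the estimator's core question: given the gap observed
# so far, how much more traffic (and how many more weeks) until it could reach
# statistical significance? Figures are illustrative, not from the tracker.
from statsmodels.stats.proportion import proportion_effectsize
from statsmodels.stats.power import NormalIndPower

rate_a, rate_b = 0.0342, 0.0332       # conversion rates observed to date
n_per_variant_so_far = 12_000         # sample collected over the six weeks
weekly_visitors_per_variant = 2_000   # current traffic split per variant

effect = proportion_effectsize(rate_a, rate_b)   # Cohen's h for the observed gap
required_n = NormalIndPower().solve_power(effect_size=effect,
                                          alpha=0.05, power=0.8, ratio=1.0)

extra_weeks = max(required_n - n_per_variant_so_far, 0) / weekly_visitors_per_variant
print(f"~{required_n:,.0f} visitors per variant needed; ~{extra_weeks:.0f} more weeks")
# When the answer comes back in years rather than weeks, calling the experiment
# without a winner is the pragmatic choice.
```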

Being agile in experimentation means setting aside the paralysis of needing “more time” or “more data”. I’ve been guilty of wanting more time to perfect my analyses, but you are not always afforded that luxury. A thriving experimentation culture requires making timely decisions to continuously test and learn; yet this does not have to come at the expense of informed analysis. The right analysis can in fact support effective experimentation on feature development, with conversion rate optimisation at its core.

So, to test or not to test? Deciding whether to pursue a culture of experimentation as an analytical mindset across your teams involves, in its own (Nolan’s) Inception-esque way, analysis.

’Tis not nobler in the mind to suffer from running experiments for the sake of doing experimentation.

’Tis nobler to take arms with analyses — proactive analysis that will inform your hypotheses, as well as pre-emptive analysis to determine the success of your experiments. With this approach in mind, you may well make it to nirvana.

*Stay tuned — this experimentation tracker deserves its own post!
