Sequential testing for statistical inference

This article helps you:

  • Familiarize yourself with the statistical testing method used by Amplitude Experiment

Amplitude Experiment uses a sequential testing method of statistical inference. Sequential testing has several advantages over T-tests, another widely used method. Chief among them: you don’t need to decide before the experiment starts how many observations you’ll need to reach significance.

Why is this important? With sequential testing, results are valid whenever you view them. That means you can end an experiment early based on the observations made up to that point. It also means the number of observations you need to make an informed decision is, on average, much lower than the number you’d need with a T-test or a similar procedure. You can experiment more quickly, incorporating new learnings into your product and accelerating the pace of your experimentation program.
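To see why checking results early is a problem for a fixed-sample test, here is a minimal simulation sketch. It is not Amplitude's implementation. It assumes an A/A test (no true difference), normally distributed metrics with a known variance of 1, and a plain two-sided z-test. The function names, the look schedule, and the seed are hypothetical choices made for illustration.

```python
import math
import random

Z_CRIT = 1.959963984540054  # two-sided critical value for alpha = 0.05


def aa_test_rejections(rng, n_max=500, look_every=10):
    """Simulate one A/A test (no true difference) with unit-variance metrics.

    Returns (rejected_at_any_look, rejected_at_final_look_only).
    """
    sum_control = sum_treatment = 0.0
    rejected_at_any_look = False
    z = 0.0
    for n in range(1, n_max + 1):
        sum_control += rng.gauss(0.0, 1.0)
        sum_treatment += rng.gauss(0.0, 1.0)
        if n % look_every == 0:
            diff = (sum_treatment - sum_control) / n
            z = diff / math.sqrt(2.0 / n)  # standard error with variance 1 per arm
            rejected_at_any_look = rejected_at_any_look or abs(z) > Z_CRIT
    # n_max is a multiple of look_every, so z now holds the final-look statistic.
    return rejected_at_any_look, abs(z) > Z_CRIT


def false_positive_rates(n_sims=400, seed=7):
    """Return (rate when stopping at any significant look, rate with one final look)."""
    rng = random.Random(seed)
    results = [aa_test_rejections(rng) for _ in range(n_sims)]
    peeking_rate = sum(r[0] for r in results) / n_sims
    fixed_rate = sum(r[1] for r in results) / n_sims
    return peeking_rate, fixed_rate


if __name__ == "__main__":
    peeking_rate, fixed_rate = false_positive_rates()
    print(f"stop at first significant look: {peeking_rate:.3f}")
    print(f"single look at the end:         {fixed_rate:.3f}")
```

In this sketch, the single final look stays close to the nominal 5% false positive rate, while stopping at the first significant look rejects far more often. A sequential test is designed so that its error guarantee holds at every look.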

This article explains the basics of sequential testing, how it fits into Amplitude Experiment, and how you can make it work for you.

Hypothesis testing in Amplitude Experiment

When you run an A/B test, Experiment conducts a hypothesis test using a randomized controlled trial, in which users are randomly assigned to either a treatment variant or the control. The control represents your product as it currently is, while each treatment includes a set of potential changes to your current baseline product. With a predetermined metric, Experiment compares the performance of these populations using a test statistic.
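As a rough illustration of comparing two populations on a metric, the sketch below computes the difference in means, its standard error, and a z-style statistic. This is a generic textbook calculation, not the statistic Experiment uses. The function name and the sample data are hypothetical.

```python
import math


def difference_in_means(control, treatment):
    """Return (difference, standard_error, z_statistic) for treatment minus control."""
    n_c, n_t = len(control), len(treatment)
    mean_c = sum(control) / n_c
    mean_t = sum(treatment) / n_t
    # Unbiased sample variances for each variant.
    var_c = sum((x - mean_c) ** 2 for x in control) / (n_c - 1)
    var_t = sum((x - mean_t) ** 2 for x in treatment) / (n_t - 1)
    diff = mean_t - mean_c
    standard_error = math.sqrt(var_c / n_c + var_t / n_t)
    return diff, standard_error, diff / standard_error


if __name__ == "__main__":
    # Hypothetical conversion data: 1 = converted, 0 = did not convert.
    control = [1] * 40 + [0] * 160      # 20% conversion
    treatment = [1] * 56 + [0] * 144    # 28% conversion
    diff, se, z = difference_in_means(control, treatment)
    print(f"difference={diff:.3f}, standard error={se:.4f}, z={z:.2f}")
```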

In a hypothesis test, you’re looking for performance differences between the control and your treatment variants. Amplitude Experiment tests the null hypothesis 

image1.png

where

image2.png

states that there’s no difference between the treatment’s mean and the control’s mean.

For example, if you’re interested in measuring the conversion rate of a treatment variant, the null hypothesis posits that the conversion rates of your treatment variants and your control are the same.

The alternative hypothesis states that there is a difference between the treatment and control. Experiment’s statistical model uses sequential testing to look for any difference between treatments and control.
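In symbols, one common way to write the two hypotheses is sketched here. The notation is an assumption chosen for illustration: \mu_T and \mu_C stand for the treatment and control means, and \delta for their difference. The symbols in this article's figures may differ.

```latex
\[
\delta = \mu_{T} - \mu_{C}, \qquad
H_0 : \delta = 0
\quad \text{versus} \quad
H_1 : \delta \neq 0
\]
```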

There are a number of different sequential testing options. Amplitude Experiment uses a family of sequential tests called the mixture sequential probability ratio test (mSPRT). An mSPRT averages likelihood ratios over the possible alternative values of the parameter, weighting them with a function, H, called the mixing distribution. This gives the following mixture of likelihood ratios against the null hypothesis that

image3.png

:

image4.png
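For reference, the general mSPRT statistic is usually written in the statistics literature as shown below. The notation is an assumption and may not match the figure above: f_\theta is the density of one observation under parameter \theta, \theta_0 is the null value, and X_1, ..., X_n are the observations so far. The test rejects the null hypothesis the first time the statistic reaches 1/\alpha, where \alpha is the chosen false positive rate.

```latex
\[
\Lambda_n^{H,\theta_0}
  = \int_{\Theta} \prod_{i=1}^{n}
    \frac{f_{\theta}(X_i)}{f_{\theta_0}(X_i)} \, dH(\theta),
\qquad
\text{reject } H_0 \text{ at the first } n \text{ with }
\Lambda_n^{H,\theta_0} \ge \frac{1}{\alpha}
\]
```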

Currently, Amplitude only supports a comparison of arithmetic means between the treatment and control variants for uniques, average totals, and sum of property.
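Because the comparison is between arithmetic means, a normal approximation gives a convenient closed form for the mixture likelihood ratio. The sketch below is a textbook version under stated assumptions, not Amplitude's implementation: equal sample sizes per variant, a known per-observation variance `sigma2`, and a normal mixing distribution with mean 0 and variance `tau2`. All function and parameter names are hypothetical. The running p-value is the minimum over time of 1 divided by the statistic.

```python
import math


def mixture_likelihood_ratio(mean_control, mean_treatment, n, sigma2, tau2):
    """Two-sample normal-mixture likelihood ratio against H0: no difference in means.

    Assumes n observations per variant, a known per-observation variance sigma2,
    and a N(0, tau2) mixing distribution over the true difference.
    """
    diff = mean_treatment - mean_control
    scale = 2.0 * sigma2 + n * tau2
    return math.sqrt(2.0 * sigma2 / scale) * math.exp(
        (n * n * tau2 * diff * diff) / (4.0 * sigma2 * scale)
    )


def always_valid_p_values(control, treatment, sigma2, tau2):
    """Return the running p-value after each pair of observations."""
    p_value = 1.0
    p_values = []
    sum_c = sum_t = 0.0
    for n, (x, y) in enumerate(zip(control, treatment), start=1):
        sum_c += x
        sum_t += y
        ratio = mixture_likelihood_ratio(sum_c / n, sum_t / n, n, sigma2, tau2)
        # The p-value can only decrease as evidence accumulates.
        p_value = min(p_value, 1.0 / ratio)
        p_values.append(p_value)
    return p_values


if __name__ == "__main__":
    # Hypothetical data: treatment values are shifted up by 1 relative to control.
    control = [0.0, 1.0] * 50
    treatment = [1.0, 2.0] * 50
    p_values = always_valid_p_values(control, treatment, sigma2=0.25, tau2=1.0)
    first_significant = next(
        n for n, p in enumerate(p_values, start=1) if p < 0.05
    )
    print(f"p-value first drops below 0.05 after {first_significant} pairs")
```

Since the p-value never increases, you can check it after any number of observations and stop as soon as it falls below your threshold. The choice of `tau2` affects how quickly the test detects effects of a given size, and the sketch does not attempt to tune it.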

Note

Read more about sequential testing in this article on frequently asked questions.


May 21st, 2024
