Amplitude's A/B testing features rely on standard statistical techniques to determine statistical significance. This article covers some frequently asked questions about those calculations.
Note: The information in this article only applies to A/B tests in a Funnel Analysis chart. It does not apply to the Experiment Results chart, or to end-to-end experimentation in Amplitude Experiment.
How does Amplitude calculate improvement over baseline?

For each variant, Amplitude first computes a mean: the number of unique conversions divided by the sample size.

Mean of variant (A) = number of conversions (A) / sample size (A)
Mean of baseline (B) = number of conversions (B) / sample size (B)

Improvement over baseline is the relative difference between the two means:

Improvement over baseline = (Mean of variant (A) - Mean of baseline (B)) / Mean of baseline (B)
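A minimal sketch of the improvement-over-baseline calculation, using hypothetical conversion counts (the function and variable names here are illustrative, not Amplitude's internals):

```python
def mean_rate(conversions, sample_size):
    """Mean for a variant: unique conversions divided by sample size."""
    return conversions / sample_size

def improvement_over_baseline(variant_conversions, variant_n,
                              baseline_conversions, baseline_n):
    """Relative lift of the variant's mean over the baseline's mean."""
    mean_a = mean_rate(variant_conversions, variant_n)
    mean_b = mean_rate(baseline_conversions, baseline_n)
    return (mean_a - mean_b) / mean_b

# Hypothetical example: the variant converts 120 of 1,000 users,
# the baseline converts 100 of 1,000 users.
lift = improvement_over_baseline(120, 1000, 100, 1000)
print(f"{lift:.1%}")  # prints "20.0%"
```

A 0.12 variant mean against a 0.10 baseline mean yields (0.12 - 0.10) / 0.10 = 20% improvement over baseline.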
Why are unique conversions considered in the calculations but not totals?
How does Amplitude calculate statistical significance?

If you want to use the T-test to analyze your end-to-end Experiment or Experiment Results chart data, follow the steps in this Help Center article.

Interpreting stat sig results

For both sequential testing and the T-test, Amplitude uses a false positive rate of 5% to judge results, and it considers only the best-performing variant. With the default 5% false positive rate, the threshold for significance is (1 - p-value) > 95%. You can set a different false positive rate in Amplitude Experiment, but you cannot change it in the Funnel Analysis chart.

To help reduce false positives, Amplitude also requires a minimum sample size before it declares significance. Currently, the minimum is 30 samples, five conversions, and five non-conversions for each variant. Tests that do not meet these minimums are automatically considered not statistically significant.

When a test has reached statistical significance, Amplitude displays a green stat sig label; otherwise, it displays a red label.
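The gating rules described above (minimum sample sizes, 5% false positive rate, significance when (1 - p-value) > 95%) can be sketched as follows. This is an illustrative stand-in, not Amplitude's implementation: it uses a two-sided two-proportion z-test to produce a p-value, whereas Amplitude uses sequential testing or a T-test internally. The constants come from the text; everything else is an assumption.

```python
from math import erf, sqrt

# Minimums per variant, as stated in the article.
MIN_SAMPLES = 30
MIN_CONVERSIONS = 5
MIN_NON_CONVERSIONS = 5
FALSE_POSITIVE_RATE = 0.05  # fixed in the Funnel Analysis chart

def meets_minimums(conversions, n):
    """Check the per-variant minimums before any significance call."""
    return (n >= MIN_SAMPLES
            and conversions >= MIN_CONVERSIONS
            and (n - conversions) >= MIN_NON_CONVERSIONS)

def is_stat_sig(conv_a, n_a, conv_b, n_b):
    """Declare significance only when both variants meet the minimums
    and (1 - p-value) exceeds 1 - FALSE_POSITIVE_RATE."""
    if not (meets_minimums(conv_a, n_a) and meets_minimums(conv_b, n_b)):
        return False  # automatically "not statistically significant"
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = abs(p_a - p_b) / se
    # Two-sided p-value from the standard normal CDF.
    p_value = 2 * (1 - 0.5 * (1 + erf(z / sqrt(2))))
    return (1 - p_value) > (1 - FALSE_POSITIVE_RATE)
```

For example, a variant with 4 conversions out of 50 users fails the five-conversion minimum, so it is reported as not significant regardless of how large the lift looks.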
June 27th, 2024
© 2024 Amplitude, Inc. All rights reserved. Amplitude is a registered trademark of Amplitude, Inc.