This article helps you:
Find definitions of important terms for working with Amplitude Experiment
Term | Definition |
Allocation | The percent or number of targeted users you want to get this variant. |
Assignment Event | Another name for Enrollment event. |
Audience | A group of users targeted for the experiment. This audience is usually split evenly into “control” and “variant” groups. |
Baseline conversion rate | The current rate of your primary success metric prior to this experiment. |
Bonferroni correction | A statistical technique used to counteract the multiple comparisons problem (also known as multiplicity or the look-elsewhere effect). |
Confidence interval | A range of plausible values for the parameter of interest. In our case, the true parameter we’re trying to estimate is the difference in means between the treatment and control/baseline. For example: if the confidence level is set to 95% and we ran the same experiment 100 times, we would expect the confidence interval from about 95 of those runs to contain the true parameter. |
Confidence / significance level | The significance level is the probability of a false positive, and the confidence level is 1 minus the significance level. For example, if you have a 95% confidence level, there is a 5% chance of detecting a change to your success metric when there was really no change. |
Guardrail metric | A metric you want to ensure doesn't suffer at the expense of increasing your success metrics. For example, if you drive users to a free trial of your business product, trials of your consumer product could be a guardrail metric: if business trials go up, consumer trials may go down. You want to make sure there's a net positive effect. |
CUPED | Controlled-experiment Using Pre-Experiment Data, also known as CUPED, is an optional statistical technique meant to reduce variance in experimentation. |
Exposure Event | The event that indicates when a user has actually seen a change based on an experiment. |
Hypothesis | An assumption about the approach you could take to solve or ease the problem in your problem statement, and why it should work. |
p-value | The probability of observing data at least as extreme as what you saw, assuming that there is no difference between treatment and control. |
Payload | Variables attached to a variant that you can use to remotely change flags and experiments without a code change. |
Primary success metric | The main metric you hope to move by running this experiment. Should ideally drive both customer and business success. |
Problem statement | An explanation of the internal business or user problem you are trying to solve. |
Run time | Based on the sample size needed per variant and your traffic levels, how long your experiment takes to run. |
Sample size | The number of users/amount of traffic you need in each of your experimental variants to soundly detect statistical significance. |
Secondary success metric | An additional metric you hope/expect to move with this experiment. |
Sequential testing | A statistical analysis where the sample size isn't fixed in advance, allowing you to conduct an A/B test, peek at your results, and conclude the test without inflating your false positive rate. |
Statistical power | The probability that you detect a change to your success metric when there is a change to be detected. |
T-test | A statistical analysis that compares the means of two populations of data to decide if the difference between them is statistically significant. |
Target lift / minimum detectable effect (MDE) | The smallest percentage change in your primary success metric that you want this experiment to be able to detect. |
Type 1 error | Incorrectly classifying that there is a statistically significant difference between treatment and control, when there isn't. |
Type 2 error | Incorrectly classifying that there is no difference between treatment and control, when there is. |
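
To see how several of these terms fit together, here is a minimal Python sketch. It assumes a conversion-rate metric and uses textbook fixed-sample, normal-approximation formulas; it is not the calculation Amplitude Experiment performs, and the function names (`bonferroni_alpha`, `sample_size_per_variant`, `two_sided_p_value`) are hypothetical, chosen only for illustration.

```python
from math import ceil, sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, sd 1)


def bonferroni_alpha(alpha: float, num_comparisons: int) -> float:
    """Bonferroni correction: split the significance level across comparisons."""
    return alpha / num_comparisons


def sample_size_per_variant(baseline_rate: float, relative_mde: float,
                            alpha: float = 0.05, power: float = 0.8) -> int:
    """Approximate users needed per variant for a two-sided test on proportions.

    baseline_rate: baseline conversion rate, e.g. 0.10
    relative_mde:  minimum detectable effect as a relative lift, e.g. 0.10 for +10%
    alpha:         significance level (1 - confidence level)
    power:         statistical power (1 - Type 2 error rate)
    """
    p1 = baseline_rate
    p2 = baseline_rate * (1 + relative_mde)
    z_alpha = STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_power = STD_NORMAL.inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_power) ** 2 * variance / (p2 - p1) ** 2)


def two_sided_p_value(conv_control: int, n_control: int,
                      conv_treatment: int, n_treatment: int) -> float:
    """Two-sided p-value from a pooled two-proportion z-test.

    This is a large-sample stand-in for the T-test entry above.
    """
    rate_c = conv_control / n_control
    rate_t = conv_treatment / n_treatment
    pooled = (conv_control + conv_treatment) / (n_control + n_treatment)
    std_err = sqrt(pooled * (1 - pooled) * (1 / n_control + 1 / n_treatment))
    if std_err == 0:
        return 1.0  # no variation in the data, so no evidence of a difference
    z = (rate_t - rate_c) / std_err
    return 2 * (1 - STD_NORMAL.cdf(abs(z)))


if __name__ == "__main__":
    # Significance level to use per metric when checking 5 metrics at alpha = 0.05
    print(bonferroni_alpha(0.05, 5))
    # Users per variant for a 10% baseline and a 10% relative MDE
    print(sample_size_per_variant(0.10, 0.10))
    # p-value for 1,000/10,000 control vs. 1,150/10,000 treatment conversions
    print(two_sided_p_value(1000, 10000, 1150, 10000))
```

A smaller MDE or a lower baseline conversion rate increases the required sample size, which in turn lengthens the run time at a given traffic level. A p-value below the significance level would be read as a statistically significant result in this fixed-sample setting.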
August 28th, 2024