Finalize your experiment's advanced settings
The final step in creating your experiment is to specify advanced settings. These settings include:
- Exposure settings: Settings for the exposure event that triggers before your audience receives your experiment.
- Stats Preferences: Statistical settings for experiment analysis.
- Bucketing options: Settings for bucketing and targeting your audience.
To set advanced settings
- In your experiment, scroll down to the Advanced section and click the edit icon.
- Set your preferences using the definitions below.
- Click Save and Close.
After you save your settings, test your experiment.
Exposure settings
Exposure settings are the configuration rules that define when and how Amplitude marks a user as exposed to an experiment or feature. These settings determine the logic that triggers an exposure event: whether a user counts as exposed the first time they qualify for an experiment, the first time they interact with a feature, or under custom criteria.
In your Experiment Design options, click Advanced and then click Exposure Settings to specify the settings you want. You can modify any of the following:
Exposure event
An exposure event is the moment when a user becomes eligible for an experiment variant or feature. Amplitude shows users the experiment variant regardless of whether they interact with it. This event serves as the anchor point for experiment analysis and ensures that Amplitude attributes downstream behaviors and outcomes to the correct variant. By logging exposure events, Experiment prevents biases such as double counting or misattribution. Exposure events also establish a consistent link between user actions and the experiment they were exposed to.
You can specify:
- Exposure Event: Choose which exposure event triggers the experiment. The default is the Amplitude Exposure event. Amplitude recommends leaving this setting as is, but you can specify a custom exposure event.
- Proxy Exposure Event: For Feature Experiments, a proxy exposure event is a placeholder used to estimate the duration of the experiment based on historical data of that event. The default is Any Active Event. You can specify any recorded event as the proxy.
- Custom Exposure Settings: Choose whether to further customize your exposure settings with:
  - Attribution: Choose whether the exposure event activates only on the first instance of the user triggering it, or at any instance.
  - Window: Choose whether the experiment triggers within a specific time period of the event.
Stats Preferences
Statistical preferences are the configurable settings that determine how Amplitude analyzes and displays experiment results. By default, the settings are:
- CUPED toggled off
- Bonferroni Correction toggled on
- Custom Exposure Settings toggled off
- Test Type set to Sequential
- Confidence Level set to 95%
You can modify the Stats Preferences at any step of an experiment. They're most useful for the final analysis after the experiment ends.
This article continues directly from the Help Center article on learning from your experiment. If you haven't read that article, do so before continuing here.
CUPED
Controlled-experiment Using Pre-Experiment Data, also known as CUPED, is an optional statistical technique that reduces variance in Amplitude Experiment. When you toggle CUPED on, Amplitude Experiment accounts for possible varying treatment effects across user segments. CUPED isn't the best choice for every experiment. For example, avoid CUPED when targeting only new users, because new users have no pre-exposure data for CUPED to use.
The random bucketing process can deliver unbalanced groups of users to each variant. This unbalance is pre-exposure bias, and CUPED addresses it. Without CUPED, pre-exposure bias persists in your experiment. This is why you may notice differences in the mean-per-variant when running the same experiment with and without CUPED.
For a more technical explanation, refer to this detailed blog post.
For more on how CUPED affects experiment results, refer to this blog.
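The adjustment described above can be sketched in a few lines of Python. This is a minimal illustration of the textbook CUPED formula with a single pre-exposure covariate, not Amplitude's implementation. The function name `cuped_adjust` and the simulated data are assumptions made for this example.

```python
import random
from statistics import mean, variance

def cuped_adjust(pre, post):
    """Return CUPED-adjusted outcome values (hypothetical helper).

    pre  : each user's metric value before exposure (the covariate)
    post : each user's metric value after exposure (the outcome)
    """
    pre_mean = mean(pre)
    post_mean = mean(post)
    # theta = cov(pre, post) / var(pre): the slope of post regressed on pre
    cov = sum((x - pre_mean) * (y - post_mean) for x, y in zip(pre, post)) / (len(pre) - 1)
    theta = cov / variance(pre)
    # Remove the part of each outcome that pre-exposure behavior explains
    return [y - theta * (x - pre_mean) for x, y in zip(pre, post)]

# Simulated users whose post-exposure metric correlates with their history
rng = random.Random(0)
pre = [rng.gauss(10, 3) for _ in range(1000)]
post = [x + rng.gauss(1, 1) for x in pre]

adjusted = cuped_adjust(pre, post)
print("variance before:", round(variance(post), 2))
print("variance after: ", round(variance(adjusted), 2))
```

In this sketch the adjusted values keep the same overall mean as the raw values but have a much smaller variance. In a real analysis, you would typically compute theta on data pooled across variants and then compare the adjusted means per variant, which is why per-variant means can differ with and without CUPED.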
Bonferroni Correction
Amplitude Experiment uses the Bonferroni correction to address potential problems with multiple hypothesis testing. Although the Bonferroni correction is a trusted statistical method, you may not want to use it in every case. One example is when you want to compare results with those generated by an internal system that doesn't support the Bonferroni method. In this case, and if you accept higher false positive rates, toggle the Bonferroni Correction off.
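To see what the correction does, consider this minimal Python sketch of the standard Bonferroni rule, which divides the significance threshold by the number of comparisons. How Amplitude counts comparisons isn't covered here, so the helper `significant` and the p-values are assumptions for illustration only.

```python
def significant(p_values, alpha=0.05, correct=True):
    """Flag which p-values pass the significance threshold.

    With correct=True, apply the Bonferroni rule: alpha / number of comparisons.
    """
    threshold = alpha / len(p_values) if correct else alpha
    return [p < threshold for p in p_values]

# Hypothetical p-values from three comparisons in one experiment
p_values = [0.04, 0.012, 0.30]

print(significant(p_values))                 # [False, True, False]
print(significant(p_values, correct=False))  # [True, True, False]
```

With the correction on, the threshold drops from 0.05 to roughly 0.0167, so the first comparison no longer counts as significant. Turning the correction off restores the looser threshold and the higher false positive rate that comes with it.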
Statistical Method
Select which statistical method you want to use:
- Sequential testing: A statistical method that analyzes results continuously as data comes in, instead of only at a fixed sample size. This approach lets teams review experiment results continuously without inflating false positive risk. Because the method corrects for repeated looks at the data, it's useful for making faster decisions when effects are strong. It requires careful setup to avoid bias. Refer to Sequential Testing for more information.
- T-Testing: A traditional statistical test that compares the means of two groups, such as the control and treatment groups, to determine if differences are statistically significant. It assumes normally distributed data and fixed sample sizes. A t-test is simple and widely understood, but it's less flexible if you want to check results continuously or handle more complex outcome distributions. Refer to T-testing for more information.
- Bayesian: A statistical method that compares groups by calculating the probability that one variant outperforms another. Unlike traditional methods that rely on p-values and fixed hypothesis testing, Bayesian statistics provides direct probability estimates that align with how teams make decisions. Bayesian methods excel when you want continuous insight into experiment performance. They're valuable when you need to incorporate prior knowledge, make decisions with smaller sample sizes, or require probability statements that directly answer business questions like "How likely is this variant to succeed?" Refer to Bayesian Statistics for more information.
- Thompson Sampling: A Bayesian bandit approach that dynamically allocates more traffic to variants that appear to perform better. Instead of waiting until an experiment ends, Thompson Sampling balances exploration and exploitation in real time. This approach improves user experience by gradually sending more users to promising variants. It doesn't provide a classic p-value, but relies on posterior probabilities, making it a useful choice when you need adaptive decision-making.
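To make the last option more concrete, here is a minimal Python sketch of Thompson Sampling for a conversion metric. It assumes a Beta(1, 1) prior per variant and simulated conversion rates. The function names and variant names are hypothetical, and this isn't a description of how Amplitude allocates traffic.

```python
import random

def thompson_choose(stats, rng):
    """Pick a variant by drawing once from each variant's Beta posterior.

    stats maps variant name -> (conversions, non_conversions).
    """
    draws = {
        name: rng.betavariate(1 + wins, 1 + losses)  # Beta(1, 1) prior
        for name, (wins, losses) in stats.items()
    }
    return max(draws, key=draws.get)

def simulate(true_rates, rounds, seed=0):
    """Serve `rounds` simulated users and count how often each variant is served."""
    rng = random.Random(seed)
    stats = {name: (0, 0) for name in true_rates}
    served = {name: 0 for name in true_rates}
    for _ in range(rounds):
        name = thompson_choose(stats, rng)
        served[name] += 1
        wins, losses = stats[name]
        # Simulate whether this user converts, then update the posterior counts
        if rng.random() < true_rates[name]:
            stats[name] = (wins + 1, losses)
        else:
            stats[name] = (wins, losses + 1)
    return served

served = simulate({"control": 0.05, "treatment": 0.15}, rounds=5000)
print(served)
```

Early on, both variants receive traffic because their posteriors overlap (exploration). As evidence accumulates, the better-performing variant wins most of the draws and receives most of the traffic (exploitation).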
Confidence Level
The confidence level measures how confident Experiment is that it generates the same results for the experiment across repeated rollouts. The default confidence level of 95% means that when there's no real difference between variants, about 5% of the time you might still interpret the results as statistically significant. Lowering your experiment's confidence level makes it more likely that your experiment reaches statistical significance, but the likelihood of a false positive goes up. Don't go below 80%, because the experiment's results may no longer be reliable.
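The trade-off can be illustrated with a short Python sketch. It assumes a simple fixed-sample, two-sided z-test rather than Amplitude's sequential method, and the function name `two_sided_critical_z` is made up for this example.

```python
from statistics import NormalDist

def two_sided_critical_z(confidence_level):
    """Return the z-score a result must exceed to be called significant."""
    alpha = 1 - confidence_level  # tolerated false positive rate
    return NormalDist().inv_cdf(1 - alpha / 2)

for level in (0.80, 0.90, 0.95, 0.99):
    alpha = 1 - level
    print(f"confidence {level:.0%}: false positive rate {alpha:.0%}, "
          f"critical z {two_sided_critical_z(level):.3f}")
```

At 95%, the critical value is about 1.96. At 80%, it drops to about 1.28, so smaller observed differences clear the bar, and one in five experiments with no real effect would also clear it.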
Bucketing options
Specify how bucketing works in your experiment. You can specify:
- Evaluation Mode: Select whether Amplitude evaluates the experiment remotely on Amplitude servers or locally on your own machine. By default, Amplitude evaluates experiments remotely. Refer to Performance and Caching for more information.
- Sticky Bucketing: Specify whether to serve users the same variant after allocation, even if the rollout or targeting criteria change. When sticky bucketing is on, Amplitude doesn't re-bucket users when the targeting criteria change. The default is off. Refer to Sticky Bucketing for more information.
- Bucketing Salt: A string value used as part of the hashing process. The bucketing salt assigns users deterministically into experiment variants. By combining the bucketing salt with identifiers such as the user ID and experiment key, Experiment generates a random-looking but repeatable hash that places each user into the same variant across sessions. Changing the bucketing salt reshuffles assignments and re-randomizes users for that experiment.
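The role of the bucketing salt can be sketched in Python as follows. This is an illustration of salted, hash-based assignment in general. It uses SHA-256 as a stand-in, assumes equal variant weights, and uses a made-up function name and input format. It doesn't reproduce Amplitude's actual hashing algorithm.

```python
import hashlib

def assign_variant(user_id, experiment_key, salt, variants):
    """Deterministically map a user to a variant (hypothetical helper)."""
    # Combine the salt with the identifiers, then hash the result
    raw = f"{salt}/{experiment_key}/{user_id}".encode("utf-8")
    digest = hashlib.sha256(raw).digest()
    # Interpret the first 8 bytes as an integer and pick a bucket
    bucket = int.from_bytes(digest[:8], "big") % len(variants)
    return variants[bucket]

variants = ["control", "treatment"]
users = [f"user-{i}" for i in range(1000)]

first = [assign_variant(u, "checkout-test", "salt-a", variants) for u in users]
again = [assign_variant(u, "checkout-test", "salt-a", variants) for u in users]
resalted = [assign_variant(u, "checkout-test", "salt-b", variants) for u in users]

print("stable across calls:", first == again)
print("users moved by new salt:", sum(a != b for a, b in zip(first, resalted)))
```

The same salt always produces the same assignment for a given user, which keeps the experience consistent across sessions. A new salt produces an unrelated hash, so a share of users land in a different variant.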