Experiment Evaluation

Evaluation determines which variant, if any, a user receives given a flag configuration. Evaluation takes a user and a flag configuration as inputs and outputs a variant.

Diagram of Experiment evaluation steps from user and flag to variant

Pre-targeting

Pre-targeting steps can determine the evaluated variant before targeting segments run.

Activation

A flag is either active or inactive. Inactive flags never return a variant from evaluation.

Best practice

For on/off flags, Amplitude recommends setting the all users segment allocation to 100% or 0% rather than using the Activate/Deactivate flag button to control traffic. Use the Activate/Deactivate button to remove flags after a feature fully launches, or after you remove the flag's instrumentation.

Flag dependencies

A flag can define a dependency on another flag's evaluation. If the dependency isn't met, no variant returns. Otherwise, evaluation continues. Flag dependencies implement mutual exclusion groups and holdout groups.

For example, Flag-2 can define a dependency on Flag-1 evaluating to the variant on.

  • Flag-1 (50% on).
  • Flag-2 (50% control, 50% treatment).
    • Depends on Flag-1=on.

The dependency ensures Amplitude always evaluates Flag-1 before Flag-2. If Flag-1 evaluates to on, Amplitude fully evaluates Flag-2. If Flag-1 doesn't evaluate to a variant, or evaluates to a variant other than on, Flag-2 fails the dependency check and Amplitude assigns no variant. This prevents edge cases where the dependency checks get skipped or return undefined results. The dependency also keeps exposure events and audit trails consistent.

In this example, Flag-2 fully evaluates only for the 50% of users who receive on from Flag-1, so Amplitude assigns a Flag-2 variant to 50% of evaluated users.
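
The dependency check in this example can be sketched as follows. This is a minimal illustration, not Amplitude's implementation: the function name `passes_dependency`, the flag keys, and the dictionary of prior results are all hypothetical.

```python
from typing import Dict, Optional, Set


def passes_dependency(
    evaluated: Dict[str, Optional[str]],
    dependency_flag: str,
    required_variants: Set[str],
) -> bool:
    """Return True only if the dependency flag evaluated to a required variant."""
    # A missing flag or a None result means "no variant", which fails the check.
    return evaluated.get(dependency_flag) in required_variants


# Flag-1 evaluated to "on": Flag-2 continues to its own evaluation.
flag_1_on = passes_dependency({"flag-1": "on"}, "flag-1", {"on"})
# Flag-1 returned no variant: Flag-2 assigns no variant.
flag_1_none = passes_dependency({"flag-1": None}, "flag-1", {"on"})
```

Because the check reads Flag-1's result, Flag-1 has to be evaluated first, which matches the ordering described above.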

Individual inclusions

Inclusions force-bucket specific users (identified by user ID or device ID) into a variant. Inclusions primarily support development.

For example, if you're developing a new multivariate feature and want to test each variant in your application, add your user or device ID to the Inclusions section of your experiment and refresh the application.

Sticky bucketing

Use sticky bucketing with care. Even with sticky bucketing disabled, consistent bucketing places users in the same variant when the user and targeting rules don't change. Changing targeting rules on an active flag with sticky bucketing enabled can cause a sample ratio mismatch (SRM), which can skew experiment results.

When you enable sticky bucketing, Amplitude always evaluates a user to the same previously bucketed variant, regardless of current targeting. Sticky bucketing doesn't apply if Amplitude hasn't yet bucketed the user into a variant.

Go to Sticky Bucketing for more information.
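
As a rough sketch of the sticky bucketing rule described above, assuming a hypothetical in-memory record of previously bucketed users and a hypothetical `evaluate_targeting` callback that stands in for the current targeting and bucketing logic:

```python
from typing import Callable, Dict, Optional


def evaluate_with_sticky_bucketing(
    user_id: str,
    previously_bucketed: Dict[str, str],
    sticky_enabled: bool,
    evaluate_targeting: Callable[[str], Optional[str]],
) -> Optional[str]:
    """Return the previously bucketed variant when sticky bucketing applies."""
    # Sticky bucketing applies only if the user already has a bucketed variant.
    if sticky_enabled and user_id in previously_bucketed:
        return previously_bucketed[user_id]
    # Otherwise fall through to the current targeting and bucketing.
    return evaluate_targeting(user_id)


# Hypothetical data: user-1 was bucketed into "treatment" earlier, and the
# current targeting rules would now return "control" for everyone.
previous = {"user-1": "treatment"}
current_rules = lambda user_id: "control"

sticky_result = evaluate_with_sticky_bucketing("user-1", previous, True, current_rules)
new_user_result = evaluate_with_sticky_bucketing("user-2", previous, True, current_rules)
```

Here `sticky_result` keeps the earlier variant even though the current rules disagree, while `new_user_result` follows the current rules because that user has no earlier bucketing.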

Targeting segments

Adding a target segment without defining any rules (where clauses) captures all users, even though the estimates show 0 users.

A flag or experiment can have zero or more targeting segments. Amplitude evaluates targeting segments from top to bottom. If a user matches a segment's targeting rules, consistent bucketing determines which variant, if any, the user receives, based on that segment's configured allocation percentage and variant distribution weights.
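
The top-to-bottom order can be illustrated with a short sketch. It assumes that the first matching segment wins and that unmatched users fall through to the all users segment. The `Segment` type, the predicates, and the user properties are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

User = Dict[str, str]


@dataclass
class Segment:
    name: str
    matches: Callable[[User], bool]  # stands in for the segment's rules


def first_matching_segment(user: User, segments: List[Segment]) -> str:
    """Walk the segments in order and return the first one the user matches."""
    for segment in segments:
        if segment.matches(user):
            return segment.name
    # No targeting segment matched, so the all users segment applies.
    return "all users"


segments = [
    Segment("beta testers", lambda u: u.get("plan") == "beta"),
    Segment("canada", lambda u: u.get("country") == "CA"),
]

both = first_matching_segment({"plan": "beta", "country": "CA"}, segments)
neither = first_matching_segment({"country": "US"}, segments)
```

A segment with no rules behaves like a predicate that always returns `True`, so it captures every user who reaches it.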

All users segment

The all users segment captures all users who don't match a targeting segment (if any). Consistent bucketing assigns users to a variant (or no variant) based on the configured allocation percentage and variant distribution weights.

Consistent bucketing

Amplitude Experiment's bucketing is consistent based on the user, bucketing key, bucketing salt, allocation percentage, and variant weights. Given the same inputs, the output remains constant.

The bucketing logic splits into two steps. Allocation bucketing determines whether the user receives a variant based on the allocation percentage. Variant bucketing runs only if allocation bucketing assigned the user. Both steps use the same consistent hash function in slightly different ways.

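The bucketing salt makes allocation statistically independent across experiments. Without the salt, a user's hash would be the same in every experiment, so a user allocated to the treatment in one experiment would get the treatment in every experiment with the same allocation and weights.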

Update the bucketing salt in two cases:

  1. To re-randomize users because of a bug or other issue in your experiment. Update the salt to a new random string.
  2. To make the evaluation of two experiments match. Update the salt to the same value in both experiments.

Hashing

Amplitude Experiment's consistent bucketing applies the murmur3 hash function (the 32-bit x86 variant) to the bucketing salt and the value of the bucketing key for the given segment, joined by a slash. If the bucketing salt or the bucketing value changes, the hash output changes and the user may jump to a different variant.

murmur3_x86_32("bucketing_salt/bucketing_value")
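
For readers who want to reproduce the hash locally, here is a pure-Python sketch of the standard MurmurHash3 x86 32-bit algorithm. The seed of 0, the UTF-8 encoding, and the helper name `bucketing_hash` are assumptions for illustration and aren't confirmed by this page.

```python
def murmur3_x86_32(data: bytes, seed: int = 0) -> int:
    """Standard MurmurHash3 (x86, 32-bit); returns an unsigned 32-bit integer."""
    c1, c2, mask = 0xCC9E2D51, 0x1B873593, 0xFFFFFFFF
    h = seed & mask
    n_blocks = len(data) // 4

    def mix(k: int) -> int:
        k = (k * c1) & mask
        k = ((k << 15) | (k >> 17)) & mask  # rotate left by 15
        return (k * c2) & mask

    # Body: process the input four bytes at a time, little-endian.
    for i in range(n_blocks):
        h ^= mix(int.from_bytes(data[4 * i:4 * i + 4], "little"))
        h = ((h << 13) | (h >> 19)) & mask  # rotate left by 13
        h = (h * 5 + 0xE6546B64) & mask

    # Tail: the remaining one to three bytes, if any.
    tail = data[4 * n_blocks:]
    if tail:
        h ^= mix(int.from_bytes(tail, "little"))

    # Finalization: mix in the length, then avalanche the bits.
    h ^= len(data)
    h ^= h >> 16
    h = (h * 0x85EBCA6B) & mask
    h ^= h >> 13
    h = (h * 0xC2B2AE35) & mask
    h ^= h >> 16
    return h


def bucketing_hash(salt: str, value: str) -> int:
    # Assumption: the salt and value are joined with "/" and encoded as UTF-8.
    return murmur3_x86_32(f"{salt}/{value}".encode("utf-8"))
```

The same salt and value always produce the same integer, which is what makes the later bucketing steps repeatable.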

Allocation bucketing

A user is allocated when the hash value modulo 100 is less than the allocation percentage configured in the segment.

murmur3_x86_32("bucketing_salt/bucketing_value") % 100
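
The allocation rule can be sketched as a one-line check. The function name `is_allocated` is hypothetical, and `hash_value` stands for the unsigned 32-bit murmur3 output of the salted bucketing value.

```python
def is_allocated(hash_value: int, allocation_percent: int) -> bool:
    """Return True when the hash falls inside the allocated share of 0..99."""
    # hash_value % 100 is in 0..99, so 0 allocates nobody and 100 everybody.
    return hash_value % 100 < allocation_percent


# Hypothetical hash values: 149 % 100 == 49 and 150 % 100 == 50.
inside = is_allocated(149, 50)
outside = is_allocated(150, 50)
```

With a 50% allocation, remainders 0 through 49 are allocated and remainders 50 through 99 aren't.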

Variant bucketing

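After allocation, variant bucketing determines which variant the user receives. Amplitude associates each variant with a range of values between 0 and 42949672 (the largest 32-bit hash value, 4294967295, divided by 100 and rounded down), sized according to the variant weights.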

floor(murmur3_x86_32("bucketing_salt/bucketing_value") / 100)

For example, if variant A has weight 1 and variant B has weight 1, Amplitude associates variant A with values in the interval [0, 21474835] and variant B with values in the interval [21474836, 42949672].
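
One way to reproduce the intervals in this example is sketched below. The rounding at the range boundaries is an assumption chosen to match the numbers above. The names `variant_ranges` and `pick_variant` are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

MAX_VARIANT_VALUE = 0xFFFFFFFF // 100  # 42949672


def variant_ranges(weights: Dict[str, int]) -> List[Tuple[str, int, int]]:
    """Split 0..MAX_VARIANT_VALUE into inclusive ranges proportional to weight."""
    items = [(name, w) for name, w in weights.items() if w > 0]
    total = sum(w for _, w in items)
    ranges = []
    start = 0
    cumulative = 0
    for index, (name, weight) in enumerate(items):
        cumulative += weight
        if index == len(items) - 1:
            end = MAX_VARIANT_VALUE  # the last range absorbs the maximum value
        else:
            end = MAX_VARIANT_VALUE * cumulative // total - 1
        ranges.append((name, start, end))
        start = end + 1
    return ranges


def pick_variant(hash_value: int, weights: Dict[str, int]) -> Optional[str]:
    """Map floor(hash / 100) onto the weighted ranges."""
    value = hash_value // 100
    for name, start, end in variant_ranges(weights):
        if start <= value <= end:
            return name
    return None


equal_split = variant_ranges({"A": 1, "B": 1})
```

For two variants with equal weights, `equal_split` contains the two intervals given in the example, and a hash value whose quotient by 100 falls in the second interval maps to variant B.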
