Data Winsorization: Method and Examples
Discover how winsorization can transform your web and product experiments. Learn to tame outliers, boost reliability, and uncover true insights in your data.
What is winsorization?
Winsorization is a way of handling the impact of extreme values or outliers.
The statistical method involves capping your data and replacing the most extreme points—those that don’t fit the rest of your data—with less extreme values from the same dataset.
Teams use this technique to keep the data as close to its original form as possible while minimizing the influence of anomalies in their results.
Say you have a list of numbers where a few are much higher or lower than the rest. Instead of removing these outliers completely, winsorization adjusts them to more moderate values.
This process limits how high or low values can go, so any data point above or below these limits gets “winsorized” or changed to match the limit. For example, if you winsorize the top and bottom 5% of your data, the highest 5% of values would be capped at the 95th percentile and the lowest 5% would be set to the 5th percentile.
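One way to write this capping rule down formally is sketched below. The symbols are notation introduced here for illustration, not something the technique prescribes: $x_i$ is an original data point, $w_i$ is its winsorized value, and $q_{\text{lo}}$ and $q_{\text{hi}}$ are the lower and upper limits (for example, the 5th and 95th percentile values).

```latex
w_i =
\begin{cases}
q_{\text{lo}} & \text{if } x_i < q_{\text{lo}} \\
x_i           & \text{if } q_{\text{lo}} \le x_i \le q_{\text{hi}} \\
q_{\text{hi}} & \text{if } x_i > q_{\text{hi}}
\end{cases}
\qquad \text{or, equivalently,} \qquad
w_i = \min\bigl(\max(x_i,\, q_{\text{lo}}),\, q_{\text{hi}}\bigr)
```

Every point keeps its place in the dataset; only the values outside the limits change.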
This technique is instrumental in web and product experimentation, where unexpected spikes or drops in user behavior distort results. By winsorizing, you better understand typical user patterns without letting rare events dominate your analysis.
How does winsorization work?
Winsorization occurs after data collection and “smooths out” the most extreme fluctuations in your data so that it more accurately represents the underlying patterns.
The process involves a few key steps.
Set your boundaries
First, decide how much of your data to winsorize. Your web experiment might have collected a ton of information on session lengths, purchase amounts, or page loads. Winsorizing these variables one at a time can provide a clearer picture of your findings.
Your boundary is typically expressed as a percentage, such as 1%, 5%, or 10%. These percentages determine your upper and lower limits (i.e., where you will cap your data).
The level you choose depends on how conservative you want to be. A higher percentage caps more of your data and does more to limit the influence of extreme outliers, while a lower percentage preserves more of the natural variation but risks keeping those unusual values.
Find the outliers
Using your chosen boundary, identify the corresponding percentile values in your dataset. For a 5% winsorization, you’d find the 5th and 95th percentile values. For a 10% winsorization, the 10th and 90th.
Any data points below the lower bound or above the upper bound are tagged as outliers for adjustment.
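The two steps above can be sketched in a few lines of Python. This is a minimal illustration, not a definitive implementation: the function names and the page load times are hypothetical, and it assumes the linearly interpolated percentile definition that Python's standard `statistics.quantiles` uses with `method="inclusive"` (other definitions give slightly different cutoffs).

```python
import statistics

def winsorization_bounds(values, percent):
    """Return (lower, upper) cutoffs for a symmetric winsorization.

    `percent` is a whole number from 1 to 49: 5 means the 5th and 95th percentiles.
    """
    # quantiles(n=100) returns 99 cut points; cuts[k - 1] is the k-th percentile.
    cuts = statistics.quantiles(values, n=100, method="inclusive")
    return cuts[percent - 1], cuts[100 - percent - 1]

def find_outliers(values, lower, upper):
    """Return the values that fall outside the cutoffs, in their original order."""
    return [v for v in values if v < lower or v > upper]

# Hypothetical page load times in milliseconds.
page_load_ms = [120, 180, 200, 210, 220, 230, 240, 250, 260, 300, 9000]

lower, upper = winsorization_bounds(page_load_ms, percent=10)
outliers = find_outliers(page_load_ms, lower, upper)
print(f"10th percentile: {lower}, 90th percentile: {upper}")
print(f"Flagged for adjustment: {outliers}")
```

At this point nothing has been changed yet. The flagged values are only candidates for the adjustment step.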
Adjust extreme values
Instead of removing outliers, winsorization replaces them:
- Values below the lower bound are increased to match the lower cutoff point
- Values above the upper bound are decreased to match the upper cutoff point
For example, if your 5th percentile value is 10, and you have a data point of two, that point would be adjusted to 10. Similarly, a point at 200 that exceeds the 95th percentile value of 150 would be capped at 150.
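Here is a small sketch of that replacement step using the cutoffs from the example (10 and 150). The function name `cap_values` is hypothetical, and the in-range points 10, 75, and 150 are added only to show that values inside the limits are left alone.

```python
def cap_values(values, lower, upper):
    """Replace values outside [lower, upper] with the nearest cutoff."""
    return [min(max(v, lower), upper) for v in values]

# Cutoffs from the example above; the data points 2 and 200 are the outliers.
lower_cutoff, upper_cutoff = 10, 150
data = [2, 10, 75, 150, 200]

capped = cap_values(data, lower_cutoff, upper_cutoff)
changed = sum(1 for before, after in zip(data, capped) if before != after)

print(capped)
print(f"{changed} of {len(data)} values adjusted")
```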
Use for analysis
After replacing the extreme values, you now have a winsorized dataset. This new set has the same number of data points as the original but with the outliers tamed.
Double-check your work by comparing the summary statistics of your original and winsorized datasets. You should see reduced variance and a narrower range but a similar central tendency, and the overall shape of your data should look much the same.
Once verified, use this winsorized dataset in your experimental analysis. The data provides results that are less affected by extreme outliers.
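A quick way to run that before-and-after check is sketched below. The purchase amounts are hypothetical, and the cutoffs (10 and 150) are picked by hand for illustration rather than derived from percentiles.

```python
import statistics

def summarize(values):
    """Collect the summary statistics used for the before/after comparison."""
    return {
        "count": len(values),
        "min": min(values),
        "max": max(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),
    }

# Hypothetical purchase amounts and a version capped at 10 and 150.
original = [2, 20, 35, 40, 45, 50, 55, 60, 80, 200]
winsorized = [min(max(v, 10), 150) for v in original]

before, after = summarize(original), summarize(winsorized)
for name in before:
    print(f"{name:>6}: {before[name]:>8.2f} -> {after[name]:>8.2f}")
```

In this sample the count and median are unchanged, while the range and standard deviation shrink. The mean moves a little, which is expected because the capped points no longer pull on it as hard.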
Winsorization in web and product experimentation
Many product teams use winsorization to ensure experiments yield more reliable and meaningful outcomes.
Winsorized datasets often lead to better decision-making and a more accurate understanding of how changes impact user behavior. Addressing rare yet troublesome outliers gives you a clear picture of what’s happening in your tests.
Let’s examine some of the benefits of winsorization for product experiments and how the technique is used.
Deals with noisy data
Web and product data often contain extreme outliers. For example, a user might leave a browser tab open for days, warping the average session duration. Winsorization helps manage these extremes without discarding data entirely.
Improves statistical power
By reducing the influence of extreme values, winsorization increases the signal-to-noise ratio in experiments. This improvement often leads to more precise estimates, helping you detect true effects more reliably, even with smaller sample sizes.
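To make the signal-to-noise point concrete, here is a rough sketch that compares a test statistic before and after capping. Everything in it is an assumption for illustration: the revenue figures are made up, the cap of 15 is chosen by hand, Welch's t statistic stands in for whatever test your team uses, and the same cap is applied to both groups so neither arm is treated differently.

```python
import math
import statistics

def welch_t(control, treatment):
    """Welch's t statistic for the difference in means (treatment - control)."""
    diff = statistics.mean(treatment) - statistics.mean(control)
    std_error = math.sqrt(
        statistics.variance(control) / len(control)
        + statistics.variance(treatment) / len(treatment)
    )
    return diff / std_error

def cap_upper(values, cap):
    """Cap values above `cap`; the low end is left untouched."""
    return [min(v, cap) for v in values]

# Hypothetical revenue per user. Typical treatment users spend about 2 units
# more than typical control users, but control contains one extreme purchase.
control = [10, 12, 11, 13, 9, 10, 12, 11, 10, 500]
treatment = [12, 14, 13, 15, 11, 12, 14, 13, 12, 14]

cap = 15  # hypothetical shared cap; in practice derived from a percentile
raw_t = welch_t(control, treatment)
winsorized_t = welch_t(cap_upper(control, cap), cap_upper(treatment, cap))

print(f"t statistic on raw data:        {raw_t:.2f}")
print(f"t statistic on winsorized data: {winsorized_t:.2f}")
```

In this toy sample, the single extreme purchase inflates the variance so much that the raw comparison points the wrong way. After capping, the smaller standard error lets the typical difference show through. Keep in mind that capping also changes the quantity being estimated, so report which version of the metric you tested.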
Handles diverse user behavior
User behavior in digital products can vary wildly—some users might make huge purchases while others make tiny ones. Winsorization enables you to account for this diversity without letting rarer occasions take over your analysis.
Provides robust metric definitions
Many teams use winsorized means as key metrics. Instead of looking at the raw average revenue per user, they might winsorize the data and then take the average from this new dataset. This practice provides a more stable metric that is less likely to fluctuate due to unusual events.
Helps you compare experiments fairly
When running multiple experiments or comparing results over time, winsorization helps you make fair comparisons. It reduces the chance that a single outlier in one experiment will lead to misleading conclusions.
Balances sensitivity and stability
In A/B tests, metrics need to detect real improvements without being overly sensitive to noise. Winsorization helps strike this balance, making metrics steadier without completely blunting their sensitivity to genuine changes.
Adapts to different product areas
You can apply different levels of winsorization depending on the part of the product you’re testing. For example, you might use a higher level (5%) for checkout flow experiments due to the potential for extreme outliers impacting revenue and a lower level (1%) for content recommendation tests where the impact of outliers is less critical.
Complements other techniques
Winsorization often works alongside other statistical methods in experimentation. It can be used in combination with variance reduction techniques or sequential testing approaches.
Examples of winsorization in data analysis
These examples demonstrate how winsorization transforms messy data into actionable insights, helping you make more confident decisions about your digital products.
Analyzing conversion rates
Suppose you run an e-commerce site and want to analyze the conversion rate for a new checkout process. Raw data from 1,000 user sessions should show conversion rates ranging from 0% to 100%.
However, a few users converted multiple times due to a rare bug, giving them conversion rates over 100%. These outliers distort your average, making the new process look better than it is.
You decide to winsorize the top 5% of conversion rates by calculating the 95th percentile (say it’s 15%) and capping all conversion rates above 15% at exactly 15%.
The outcome:
- Your analysis now reflects a more realistic view of typical user behavior
- The impact of the bug is reduced without removing those data points entirely
- You can more accurately compare this checkout process to others you’re considering
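A one-sided version of the technique, as used in this example, might look like the sketch below. The function name and the 21 conversion rates are hypothetical and chosen so that the 95th percentile works out to 15%. The percentile uses linear interpolation, which is one common definition.

```python
import statistics

def winsorize_upper(values, pct):
    """Cap values above the pct-th percentile; return (capped_values, cap)."""
    ordered = sorted(values)
    position = (len(ordered) - 1) * pct / 100
    below = int(position)
    above = min(below + 1, len(ordered) - 1)
    # Linear interpolation between the two nearest ranked values.
    cap = ordered[below] + (position - below) * (ordered[above] - ordered[below])
    return [min(v, cap) for v in values], cap

# Hypothetical conversion rates in percent; 200 is the bug-affected value.
rates = [0, 2, 3, 3, 4, 4, 5, 5, 5, 6, 6, 7, 7, 8, 8, 9, 10, 11, 12, 15, 200]

capped, cap = winsorize_upper(rates, pct=95)
print(f"Cap (95th percentile): {cap}%")
print(f"Mean before: {statistics.mean(rates):.2f}%  after: {statistics.mean(capped):.2f}%")
```

Only the top of the distribution is touched here. The low values, including genuine 0% rates, stay exactly as recorded.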
Product usage metrics
Imagine you’re analyzing the time spent on a new feature in your app. Out of 10,000 user sessions, the time spent ranges from 0 seconds to 12 hours.
While most sessions last between 30 seconds and 10 minutes, some users left the app open overnight, creating extreme outliers that skew the average time spent.
In this case, you winsorize the top and bottom 2.5% of your results (5% total). You determine that the 2.5th percentile is 10 seconds and the 97.5th percentile is 30 minutes.
These thresholds mean you set all values below 10 seconds to 10 seconds and all values above 30 minutes to 30 minutes.
Results:
- You preserve data about very short and very long sessions but limit their impact
- Your average time spent metric becomes more representative of typical usage
- You can more reliably detect changes in user engagement with the new feature
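The sketch below applies the same two-sided treatment to a small, made-up sample of session durations. The 41 values are constructed so that the 2.5th and 97.5th percentiles land on the 10-second and 30-minute thresholds from the example. The helper names are hypothetical, and the percentile again assumes linear interpolation.

```python
import statistics

def percentile(values, pct):
    """Return the pct-th percentile (0-100) using linear interpolation."""
    ordered = sorted(values)
    position = (len(ordered) - 1) * pct / 100
    below = int(position)
    above = min(below + 1, len(ordered) - 1)
    return ordered[below] + (position - below) * (ordered[above] - ordered[below])

def winsorize(values, lower_pct, upper_pct):
    """Cap values below the lower percentile and above the upper percentile."""
    low = percentile(values, lower_pct)
    high = percentile(values, upper_pct)
    return [min(max(v, low), high) for v in values]

# Hypothetical session durations in seconds: mostly 30 s to 9.5 min,
# plus a 0-second bounce and a 12-hour session left open overnight.
sessions_sec = [0, 10] + [30 + 15 * i for i in range(37)] + [1800, 43200]

capped = winsorize(sessions_sec, lower_pct=2.5, upper_pct=97.5)
print(f"Range before: {min(sessions_sec)}-{max(sessions_sec)} s")
print(f"Range after:  {min(capped)}-{max(capped)} s")
print(f"Mean before: {statistics.mean(sessions_sec):.1f} s  after: {statistics.mean(capped):.1f} s")
```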
Limitations and considerations
Winsorization is a powerful tool, but it’s not without its complexities. It’s helpful to be aware of potential pitfalls and limitations and what other outlier treatments are available if you need them.
Potential loss of valuable information
Although winsorization helps manage outliers, it can also mask important data.
Extreme values might represent actual, significant events, such as a viral social media campaign that succeeded only on one platform or with one demographic.
By capping these values, you might miss insights into exceptional user behavior. Sometimes, the outliers themselves could be the most interesting part of your data. The information may present opportunities to pivot your product and marketing strategies, tapping into previously unconsidered areas.
To mitigate potential loss, you should:
- Always keep your original unwinsorized data
- Analyze both the winsorized and raw data when possible
- Be cautious when interpreting results, especially if outliers are meaningful to your business
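One way to follow these recommendations is to store the capped value next to the raw one and flag which rows were changed, so the outliers can still be inspected later. The sketch below is illustrative only: the `Observation` record, its field names, the user IDs, and the cutoffs are all hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Observation:
    user_id: str
    raw_value: float
    winsorized_value: float
    was_capped: bool

def winsorize_with_audit(records, lower, upper):
    """Cap each (user_id, value) pair but keep the raw value alongside it."""
    audited = []
    for user_id, value in records:
        capped = min(max(value, lower), upper)
        audited.append(Observation(user_id, value, capped, capped != value))
    return audited

# Hypothetical purchase amounts per user, with hand-picked cutoffs of 10 and 150.
purchases = [("u1", 12), ("u2", 48), ("u3", 950), ("u4", 30)]
audited = winsorize_with_audit(purchases, lower=10, upper=150)

# The capped rows are the ones worth a closer look before drawing conclusions.
for row in audited:
    if row.was_capped:
        print(f"{row.user_id}: raw={row.raw_value} -> winsorized={row.winsorized_value}")
```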
Choosing appropriate winsorization levels
Selecting the right level of winsorization (i.e., the percentage of top or bottom values you’ll consider “extreme” in a dataset) is crucial but can be challenging.
Too much winsorization can overly smooth your data, hiding natural patterns, whereas too little might not sufficiently address the influence of outliers.
The appropriate level can vary depending on your specific dataset and goals.
With some types of data, you may see more frequent extreme values that are less likely to be errors. A higher level of winsorization stabilizes the analysis by reducing the impact of these fluctuations and highlighting more consistent trends.
If you’re analyzing user feedback from a product test, you might use a lower winsorization level to ensure you don’t miss out on important but less common user insights.
To pick the best level:
- Start with standard levels (such as 5% or 10%) and adjust based on your results
- Use domain knowledge to guide your choice
- Try winsorizing with multiple levels and compare the results
- Consider using adaptive techniques that adjust the level based on your data’s characteristics
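Comparing several levels side by side can be done with a short loop. The sketch below is a rough illustration with made-up data and hypothetical helper names. A level of 0 means no winsorization, and the percentile uses linear interpolation.

```python
import statistics

def percentile(values, pct):
    """Return the pct-th percentile (0-100) using linear interpolation."""
    ordered = sorted(values)
    position = (len(ordered) - 1) * pct / 100
    below = int(position)
    above = min(below + 1, len(ordered) - 1)
    return ordered[below] + (position - below) * (ordered[above] - ordered[below])

def winsorized_mean(values, level):
    """Mean after capping at the level-th and (100 - level)-th percentiles."""
    low, high = percentile(values, level), percentile(values, 100 - level)
    return statistics.mean([min(max(v, low), high) for v in values])

# Hypothetical metric values: a tight cluster plus one low and one high outlier.
values = [1] + list(range(40, 59)) + [500]

results = {level: winsorized_mean(values, level) for level in (0, 1, 5, 10)}
for level, mean in results.items():
    print(f"{level:>2}% winsorization -> mean {mean:.2f}")
```

In this sample the mean drops sharply between 0% and 5% and then stops moving at 10%. A plateau like that is one hint that the outliers have been handled and that winsorizing further mostly removes real variation.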
Alternatives to winsorization
While useful, winsorization isn’t always the best choice for all experimental data. Other outlier treatment methods include the following:
- Trimming: This method simply removes the outliers instead of capping them. It is more straightforward and completely eliminates extreme values, but it reduces the sample size and affects statistical power.
- Transformation: Applying mathematical transformations (e.g., log transformation) to your data. These formulas handle skewed distributions well but can make interpretation more complex.
- Other statistics: Using statistics that are naturally less sensitive to outliers, such as the median instead of the mean. While you won’t need to modify your data, these statistics may not be suitable for all analyses.
- Modeling: Using statistical models that account for outliers, such as robust regression. Modeling can provide a more nuanced handling of extreme values, but it is more complicated to implement and understand.
- Categorization: Grouping continuous data into categories can make analysis less vulnerable to exact values. However, you might lose some of the precision in your data.
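To see how a few of these alternatives behave on the same data, consider the sketch below. It is illustrative only: the data is made up, `k` (the number of points treated as extreme at each end) is chosen by hand, and winsorizing is done here by rank, replacing the `k` most extreme values with their nearest retained neighbor, which differs slightly from percentile-based cutoffs.

```python
import math
import statistics

# Hypothetical positive metric values, sorted so the extremes sit at the ends.
data = sorted([1] + list(range(40, 59)) + [500])
k = 1  # points treated as extreme at each end (about 5% of 21 values)

raw_mean = statistics.mean(data)

# Winsorizing: replace the k lowest and highest values with their neighbors.
winsorized = [data[k]] * k + data[k:-k] + [data[-k - 1]] * k

# Trimming: drop the k most extreme values at each end instead of capping them.
trimmed = data[k:-k]

# Transformation: average on the log scale, then convert back.
log_scale_mean = math.exp(statistics.mean([math.log(v) for v in data]))

# Other statistics: the median needs no change to the data at all.
median = statistics.median(data)

print(f"raw mean:        {raw_mean:.2f}  (n={len(data)})")
print(f"winsorized mean: {statistics.mean(winsorized):.2f}  (n={len(winsorized)})")
print(f"trimmed mean:    {statistics.mean(trimmed):.2f}  (n={len(trimmed)})")
print(f"log-scale mean:  {log_scale_mean:.2f}  (n={len(data)})")
print(f"median:          {median:.2f}  (n={len(data)})")
```

Note how trimming shrinks the sample while winsorizing keeps every row, and how the log-scale mean answers a slightly different question than the arithmetic mean.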
When deciding whether to use winsorization or an alternative approach, consider:
- The nature of your data and outliers
- What you hope to achieve with your analysis
- The assumptions of your statistical methods
- How the method will affect your ability to explain and justify your results
The best strategy often involves trying multiple methods, comparing their outcomes, and understanding the trade-offs involved in each. Always be transparent about your methods and how they might impact your conclusions.
Turn statistical insights into product improvements
Winsorization is a must for teams that want to sharpen their metrics, increase the power of their experiments, and gain more actionable insights. The technique preserves data integrity while reducing the sway of unusual values.
Amplitude can help you expand your experimentation efforts. The platform integrates seamlessly with your tech stack, offering solid features for designing, running, and analyzing A/B tests and other experiments.
- Create and manage custom metrics, including winsorized versions, to ensure you measure what matters most to your business.
- Watch your experiment results unfold in real time.
- Explore how different user segments respond to your experiments, uncovering insights that might be hidden in aggregate data.
- Connect your results with broader analytics for a holistic view of user behavior.
- Share results, leave comments, and make decisions as a team, all within the platform.
Whether dealing with straightforward UI changes or more complex algorithmic experiments, Amplitude provides the infrastructure and insights you need to make data-driven decisions confidently.
Turn statistics into products your users will love.