Using Twyman’s Law to Avoid Red Herrings in Product Analytics

Being aware of Twyman's Law and its implications is bound to improve the way you analyze, experiment with, and improve your product.
Insights

May 17, 2017

13 min read

Tai Rattigan

Former Head of Partnerships, Amplitude

The world of analytics is full of red herrings and false paths.

When there’s so much data to work with, it’s easy to get careless and assume that the numbers right under your nose are always telling you the truth:

  1. A new piece of code breaks your homepage, but after looking at your analytics you see that users are spending more time on it than ever—and therefore must be more engaged.
  2. You collect the birthday of every user that signs up for your service. What you find shocks you—nearly 5% of all your users were born on January 1st.
  3. Your web analytics team comes to you with a surprising revelation. Your e-commerce business—usually active 24 hours a day, 365 days a year—shows no sales, no visitors, no nothing for an entire hour between 2am and 3am on March 12th, 2017.

Each one of these brief anecdotes is an illustration of what’s known as Twyman’s Law, which simply states that results that appear extreme or out of the ordinary usually aren’t telling you something remarkable; they’re usually just wrong.

We all know the feeling of seeing data that’s too good to be true (or too bad to be true). There are perfectly innocuous explanations for all of the cases above:

  1. Visitors are spending more time on your homepage, but only because it’s broken and it’s taking longer for them to do what they want to do.
  2. The fastest way to fill out your mandatory birthday collection form is to simply pick January 1st from the dropdown menu.
  3. In spring, many countries set their clocks forward one hour in a tradition known as Daylight Saving Time; hence the lack of any sales (or any activity, for that matter) during an hour that simply doesn’t exist.

In startup analytics, where the feedback cycles are short and the pressure to launch is great, it’s especially easy to fall prey to Twyman’s Law and make sloppy statistical mistakes.

Twyman’s Law and its audience research origins

Tony Twyman is regarded as one of the pioneers of audience research. In a career that spanned from the 1950s to the early years of the 21st century, Twyman contributed to the technical and methodological development of the field for both TV and radio measurement in the UK.

One of his most famous contributions to the field is the law named after him, which states:

Any piece of data or evidence that looks interesting or unusual is probably wrong!

The practical implication of this for anyone in product management or analytics is that every time a test bears results that are unexpected and cannot be explained by an obvious factor, there’s a high probability that they are wrong.

Another academic, Prof. Richard De Veaux of Williams College, has defined two corollaries to Twyman’s Law that apply to anyone developing software products:

  • “If it’s perfect, it’s wrong.”
  • “If it isn’t wrong, you probably knew it already.”

Beyond the theory and the rules of statistics, we should have a look at what Twyman’s Law looks like in practice. Two examples from the team working on Microsoft’s search engine Bing give us ample evidence.

Twyman’s Law as a user experience trap

There are many ways in which Twyman’s Law can manifest itself in product analytics. The fast-paced and demanding environment in which product managers operate makes them especially susceptible to the law.

Related Reading: 5 Cognitive Biases Ruining Your Growth

The team at search engine Bing is used to running thousands of experiments, in which even a small change in performance can have an impact on revenue measured in millions of dollars. Obtaining reliable results from those tests is therefore extremely important to their work. In a paper authored by a member of the team, they outline a number of unexpected test outcomes that can be attributed to Twyman’s Law.

Lower quality of search results led to better performance in key metrics

A bug in one of the experiments run on the search engine caused users to be shown very poor results in the so-called “10 blue links,” the main results shown to users for a search. This led to a 10% increase in queries per user and a 30% increase in average revenue per user.

Digging deeper, the team found that users had to run more searches before they found what they were looking for and, as a result, clicked on more paid results, leading to higher overall revenue per user.

If Microsoft were prioritizing only metrics like queries per user and average revenue, they might have concluded that deliberately lowering the quality of search results was the way to go. Obviously, such a tactic would only work in the short term. As users find themselves constantly annoyed by the results their searches yield, they would become more likely to switch to an alternative search engine.

In this case, Bing’s team understood that the aim that actually aligns with their long-term goals is to lower the average number of queries per user.

Small change in code led to a sharp rise in search result clicks

Another example comes from an experiment in which an extra piece of JavaScript code was added to search result pages so that the clicked destination was recorded before the browser was allowed to navigate to it.

This resulted in a spike in the number of users recorded as successfully clicking on search results.

In this case, the difference came down to technical details of how browsers handle the JavaScript:

“Chrome, Firefox, and Safari are aggressive about terminating requests on navigation away from the current page and a non-negligible percentage of click beacons never make it to the server. This is especially true for the Safari browser, where losses are sometimes over 50%. Adding even a small delay gives the beacon more time, and hence more click request beacons reach the server. We have seen multiple experiments where added delays made an experiment look better artificially.”

Clearly, the success in this case was not due to better performance but to an instrumentation difference. Because the team understood how browsers handle these requests, they knew something was wrong the minute they saw a sharp increase attributable to non-IE browsers, and they were able to catch the issue quickly.

Many people who engage in analytics and testing don’t have the same level of understanding and could easily fall victim to Twyman’s Law. That doesn’t mean they have to become experts in computer networking to avoid it; a basic understanding of one of the main concepts of statistics, statistical significance, would suffice.

Statistical Significance 101: How to run better experiments and avoid Twyman’s Law

The way to evade the curse of Twyman’s Law is to ground your experiments in the rules of statistics and make sure each result is statistically significant before you take it at face value. Here are several ways to achieve this.

Pick the right metrics

Going back to the first example from Bing’s experience, we saw that having a solid understanding of what moves their business forward was essential.

Choosing metrics that represent progress toward your business goals, rather than specific “feature” metrics, should be your first concern. Feature metrics are especially easy to improve, but they rarely lead to significant improvement in overall business results.

As the authors of “Seven Rules of Thumb for Web Site Experimenters” point out:

“[…] When building a feature, it is easy to significantly increase clicks to that feature (a feature metric) by highlighting it, or making it larger, but improving the overall page clickthrough-rate, or the overall experience is what really matters. Many times all the feature is doing is shifting clicks around and cannibalizing other areas of the page.”

Moreover, when you’re measuring the effect of an experiment or change that affects only a segment of your audience, the metrics that you use should be diluted by the size of that segment:

“That 10% improvement to a 1% segment has an overall impact of approximately 0.1% (approximate because if the segment metrics are different than the average, the impact will be different).”
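
As a quick sanity check on that arithmetic, the diluted impact is roughly the segment’s share of traffic multiplied by the lift measured inside the segment. A minimal sketch, with illustrative numbers that are not from the article:

```python
def overall_impact(segment_share: float, segment_lift: float) -> float:
    """Approximate overall lift from a change that only affects one segment.

    Assumes the segment's baseline metric is roughly in line with the overall
    average; if it differs, the true impact will differ too (as the quote notes).
    """
    return segment_share * segment_lift

# A 10% improvement measured in a segment that makes up 1% of traffic
print(f"{overall_impact(0.01, 0.10):.3%}")  # -> ~0.100% overall impact
```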

Figuring out the right set of metrics and developing a sound framework to track them goes a long way in preventing costly mistakes down the road.

Limit the impact of false positives

With iterative improvement, teams who move quickly to build, test, and ship run a significant risk of getting a false positive: a favorable change in an observed metric that’s the result of chance rather than real improvement.

As the number of iterations tested and the number of treatments in each experiment rise, so does the probability of getting a false positive. For example, a single test stands only about a 2.5% chance of showing a statistically significant positive lift purely by chance, while a test with six iterations of 5 treatments each has a >50% chance of doing so.
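
To see why the risk compounds, the chance of at least one spurious “win” across n independent comparisons is 1 - (1 - rate)^n. A back-of-the-envelope sketch, assuming independent comparisons and a 2.5% per-comparison rate (the positive tail of p < 0.05):

```python
def p_any_false_positive(comparisons: int, per_comparison_rate: float = 0.025) -> float:
    """Probability of at least one spurious statistically significant positive
    lift across independent comparisons (2.5% = positive tail of p < 0.05)."""
    return 1 - (1 - per_comparison_rate) ** comparisons

print(f"{p_any_false_positive(1):.1%}")   # one comparison              -> 2.5%
print(f"{p_any_false_positive(30):.1%}")  # 6 iterations x 5 treatments -> ~53%
```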

To counter the effect of this, you can use two mechanisms that will make your testing more robust:

  • Use a lower p-value threshold to require a higher level of statistical significance before you accept the result of a test. A threshold of 0.05 means accepting a 5% chance of a false positive on any single comparison; tightening it to 0.01 cuts that chance to 1% (a minimal significance check is sketched just after this list).
  • Replicate test results: While testing multiple variants of a single feature (or set of features) is always a good idea, it’s best to run a final confirmation experiment once the funnel of options has narrowed down. Doing this provides an additional level of scrutiny, which should save you from falling victim to Twyman’s Law.
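
For a concrete picture of what that stricter bar looks like, here is a plain pooled two-proportion z-test; the conversion counts and the 0.01 cutoff are illustrative assumptions, not figures from the article:

```python
from math import sqrt, erfc

def two_proportion_p_value(conv_a: int, n_a: int, conv_b: int, n_b: int) -> float:
    """Two-sided p-value for the difference between two conversion rates,
    using a pooled two-proportion z-test."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    return erfc(abs(z) / sqrt(2))  # P(|Z| >= |z|) under the null hypothesis

p = two_proportion_p_value(conv_a=480, n_a=10_000, conv_b=540, n_b=10_000)
print(f"p = {p:.3f} ->", "accept the lift" if p < 0.01 else "hold, or replicate the test")
```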

Avoid statistical interactions

When you are testing multiple elements at the same time, you run the risk of causing a statistical interaction. It happens when the combined result of two changes does not equal the sum of the changes each would cause on its own.

Interactions are a problem because the main assumption when running tests is that each test is run in isolation, so its result can be attributed solely to the changes made in that treatment. When you have an interaction, you tend to get skewed results for all experiments involved.

In organizations that run multiple tests daily, interactions are also dangerous because they can trigger unexpected bugs that cause a bad user experience.

Preventing statistical interactions altogether is hard, and even impossible for large organizations that run hundreds of tests simultaneously. The best way to keep them from happening is to add constraints when running tests: for example, making sure that one subject (i.e. a site visitor) does not participate in two tests at the same time.
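
One common way to enforce that constraint is deterministic bucketing: hash each visitor ID into exactly one of the concurrently running tests, so a given visitor can never be exposed to two of them at once. A minimal sketch, with hypothetical experiment names and visitor IDs (an illustration, not Amplitude’s or Bing’s assignment logic):

```python
import hashlib

def assign_exclusive_experiment(visitor_id: str, experiments: list[str]) -> str:
    """Deterministically place a visitor in exactly one of the concurrently
    running experiments, so no visitor participates in two tests at once."""
    digest = hashlib.sha256(visitor_id.encode("utf-8")).hexdigest()
    return experiments[int(digest, 16) % len(experiments)]

concurrent_tests = ["new_onboarding", "pricing_page_copy", "checkout_layout"]
print(assign_exclusive_experiment("visitor-42", concurrent_tests))
# The same visitor_id always maps to the same single experiment.
```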

The persistent and recklessly critical quest for truth

Product people can be naturally inclined to take positive test results at face value and move forward without putting too much thought into validating their findings. A startup isn’t a laboratory—time is always the #1 limiting factor on your survival.

Being aware of Twyman’s Law and its implications, however, is bound to improve the way you analyze, experiment with, and improve your product.

Karl Popper wrote that science, at its heart, is about the “persistent and recklessly critical quest for truth.”

Similarly, the key to mastering experimental analytics is not to identify all the possible pitfalls and traps out there, but to get a solid understanding of the foundations of running and analyzing experiments. Once you have that, avoiding Twyman’s Law is simply about staying diligent—and always checking twice on any numbers that look especially out of the ordinary.

About the author

Tai Rattigan

Former Head of Partnerships, Amplitude

Tai formerly worked with our Solutions and Technology partners at Amplitude to maintain our best-in-class network. Coming to Amplitude from the digital optimization space, Tai is excited about seeing companies discover insights and transform their businesses with Amplitude.