Dynamic Behavioral Sampling

Tracking every event is expensive, but you still need to collect data for product analytics. The solution? Dynamic behavioral sampling.

Inside Amplitude
August 13, 2018
Jin Hao Wan
Software Developer
2024 note: The ETL capability referenced in this blog post has been deprecated.

Tracking every event in a sophisticated analytics tool can be prohibitively expensive. If that is not the case now, it likely will be as your company grows.

We are no strangers to this issue here at Amplitude. We often witness companies dealing with large volumes of data, some generating as many as 100 billion data points per month.

So how do you run large-scale analyses without exhausting your budget? The answer: behavioral sampling. At Amplitude, we know how widespread the need for behavioral sampling is. We support ETL-level (extract, transform, load) sampling to reduce upfront cost and remove the need to regularly monitor data. Additionally, we implement a simple query-time sampling algorithm whose sole purpose is to deliver consistent statistical accuracy over time, rather than to reduce cost further.
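To make the idea concrete, here is a minimal sketch of one common way to sample by user rather than by event, so that each sampled user keeps a complete event history. It is an illustration under stated assumptions, not Amplitude's actual ETL or query-time algorithm: the function names, the `user_id` field, the salt, and the hash-bucketing scheme are all hypothetical.

```python
import hashlib


def in_sample(user_id: str, rate: float, salt: str = "demo-salt") -> bool:
    """Decide deterministically whether a user belongs to the sample.

    Hypothetical scheme: hash the salted user ID, map it to a number in
    [0, 1), and keep the user if that number falls below the sampling rate.
    The same user always gets the same answer, so their events stay together.
    """
    digest = hashlib.sha256(f"{salt}:{user_id}".encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") / 2**64  # uniform in [0, 1)
    return bucket < rate


def sample_events(events, rate):
    """Keep every event that belongs to a sampled user (assumed 'user_id' key)."""
    return [event for event in events if in_sample(event["user_id"], rate)]


def estimate_total(sampled_count, rate):
    """Scale a count measured on the sample back up to the full population."""
    return sampled_count / rate


if __name__ == "__main__":
    # Synthetic data: 10,000 users with 3 events each.
    events = [
        {"user_id": f"user-{i}", "event": name}
        for i in range(10_000)
        for name in ("open_app", "view_item", "purchase")
    ]
    sampled = sample_events(events, rate=0.1)
    sampled_users = {event["user_id"] for event in sampled}
    print("users in sample:", len(sampled_users))
    print("estimated total users:", estimate_total(len(sampled_users), 0.1))
```

Because the decision depends only on the user ID, a funnel or retention query run on the sample still sees each sampled user's full sequence of events. Counts measured on the sample are estimates and need to be scaled by the sampling rate.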

About the Author
Jin Hao Wan
Software Developer
Jin is on Amplitude's back-end engineering team, where he works on maintaining Amplitude's query engine and prototyping new features. He graduated from MIT with an MS in Computer Science.