2024 note: The ETL capability referenced in this blog post has been deprecated.
Tracking every event in a sophisticated analytics tool can be prohibitively expensive. If that is not the case now, it will be as your company grows.
We are no strangers to this issue here at Amplitude. We regularly see companies dealing with enormous volumes of data; some generate as many as 100 billion data points per month.
So how do you run large-scale analyses without throwing away your entire budget? The answer: behavioral sampling. At Amplitude, we are keenly aware of the widespread need for behavioral sampling. We support ETL-level (extract, transform, load) sampling to reduce upfront cost and remove the need to constantly monitor data volume. Additionally, we implement a simple query-time sampling algorithm whose sole purpose is to deliver consistent statistical accuracy over time (as opposed to reducing cost further).
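This post doesn't spell out the exact query-time algorithm, but one common way to get consistent accuracy from sample to sample is deterministic, user-level sampling: hash each user ID with a stable hash and include the user whenever the hash falls below a threshold. Because the decision depends only on the ID, the same users are selected on every query, so metrics don't jitter between runs. Here is a minimal sketch of that idea; the `in_sample` function, the salt, and the example events are hypothetical, not Amplitude's implementation.

```python
import hashlib

def in_sample(user_id: str, sample_rate: float, salt: str = "behavioral-sampling") -> bool:
    """Deterministically decide whether a user is in the sample.

    Hashing the user ID (rather than drawing a random number per query)
    means the same users are selected every time, so metrics computed on
    the sample stay consistent from one query to the next.
    """
    digest = hashlib.sha256(f"{salt}:{user_id}".encode()).hexdigest()
    # Map the first 8 hex digits to a uniform value in [0, 1).
    bucket = int(digest[:8], 16) / 0x100000000
    return bucket < sample_rate

# Hypothetical usage: keep all events for sampled users,
# then scale aggregate counts back up by 1 / sample_rate.
events = [
    {"user_id": "u1", "event": "signup"},
    {"user_id": "u2", "event": "signup"},
    {"user_id": "u3", "event": "purchase"},
]
rate = 0.1
sampled = [e for e in events if in_sample(e["user_id"], rate)]
estimated_total = len(sampled) / rate
```

Sampling whole users rather than individual events is what makes this *behavioral* sampling: a sampled user's complete event stream is retained, so behavior-dependent analyses like funnels and retention still see intact sequences.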