Make Sure Your Analytics is Scalable From Day 1

The ideal analytics platform lets you answer questions when they arise.

April 12, 2016
Archana Madhavan
Instructional Designer

When we talk about data accessibility, most of the time it’s in the “all for one, one for all” context: everyone who wants access to the data to draw insights from it can do so easily.

But we can also look at data accessibility in another way: accessibility in the sense of collecting all the data you want and having your whole company run analyses in real time.

Leading growth product managers agree on one thing: set up your analytics as soon as you have users. But if you’re at an early-stage company, using up valuable engineering resources to build a robust in-house solution that allows easy data access for all may not be feasible. In the short term, you can make do with a minimum viable analytics stack, gathering data through Google Analytics reports or by running scripts on a MySQL database.

But what happens at that inflection point when your company experiences hyper-growth? When your active user count begins to grow exponentially?

More likely than not, you’ll reach the limits of your homegrown analytics infrastructure. Things will start to break, and you’ll spend all your time putting out fires instead of building your product.

To create a data-informed culture early on in your company, it’s well worth investing the time and money in a self-service analytics solution that allows data access for all. That doesn’t just mean a platform with an intuitive UI that everyone can use; it means looking at a solution through the lens of scalability.

Scalable Analytics Means Better Data Access

A high-growth company can outgrow a “scrappy” analytics infrastructure fast.

When Asana hit the limits of their data infrastructure, queries they ran on MySQL took upwards of 6 hours, and their daily data processing latency was barely under 24 hours. Their infrastructure had been sufficient for the first couple of years after launch, but it became clear that the system was not scaling with their growth.

Increasingly long query times, an inability to look at their data in real time, and a growing number of people blocked by having to write code all meant a longer time to insight. In the tech world, where speed is key to success, Asana’s old analytics infrastructure was holding them back.

Many high-growth companies use third-party analytics vendors–like Amplitude (yes, that’s us), Mixpanel, Localytics, and Google Analytics Premium, to name a few–to make analytics quicker, easier, and more accessible to the whole company. However, few vendors think about data accessibility from the perspective of scaling.

When you only have a few active users, a sophisticated self-service analytics tool might seem too extravagant for your budget. The majority of these tools base their pricing on data volume, which means a bootstrapped startup might be forced to scrimp on instrumenting everything it wants to track, or to sample its query results. In both cases, you’re at a disadvantage.

If your query results are sampled “in order to reduce latency,” you only get a rough snapshot of your data. That leads to inaccurate metrics (and consequently poor product decisions), and you miss the chance to capitalize on the behaviors of your most active users. On the other hand, if you have to pick and choose what to instrument, or only look at high-level KPIs, you’ll miss key insights into how your users are using your product.
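To make the sampling problem concrete, here is a minimal sketch in Python. All of the numbers are made up for illustration (10 power users generating 2,000 events each, 990 casual users generating 5 each), and the helper `estimate_total_events` is hypothetical, not part of any vendor's API. It estimates total event volume from a 1% sample of users and shows how much the answer depends on whether a power user happens to land in the sample.

```python
# Synthetic, hypothetical numbers: 10 power users and 990 casual users.
POWER_USERS, CASUAL_USERS = 10, 990
POWER_EVENTS, CASUAL_EVENTS = 2000, 5

# Events per user; the power users happen to sit at the start of the list.
events_per_user = [POWER_EVENTS] * POWER_USERS + [CASUAL_EVENTS] * CASUAL_USERS


def estimate_total_events(events, stride, offset):
    """Estimate total events from every `stride`-th user, starting at `offset`."""
    sample = events[offset::stride]
    mean_per_user = sum(sample) / len(sample)
    # Scale the sampled mean back up to the full user base.
    return mean_per_user * len(events)


true_total = sum(events_per_user)
# A 1% sample that happens to include one power user.
with_power_user = estimate_total_events(events_per_user, stride=100, offset=0)
# A 1% sample that happens to include none.
without_power_user = estimate_total_events(events_per_user, stride=100, offset=10)

print(f"true total:                 {true_total}")
print(f"sample with a power user:   {with_power_user:.0f}")
print(f"sample without power users: {without_power_user:.0f}")
```

Under these assumed numbers, the true total is 24,950 events. One sample puts it at 204,500 (about eight times too high) and the other at 5,000 (about five times too low), and the second sample contains no trace of the most active users at all.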

Analytics that cannot scale does not give you the data access or data insights you need to make smart product decisions.

An analytics platform that scales effectively does so in three dimensions: data volume, price, and number of analytics users, all without sacrificing performance. That last dimension means the platform can support more and more users running simultaneous queries on a rapidly growing volume of data, in a reasonable time and without crashing. If you’re at a rapidly growing company, you’ll be dealing with exactly this scenario.

Scaling Analytics at Amplitude

Amplitude is one of the few analytics solutions uniquely positioned to address all three dimensions of scalability.

For Steven Jian, CTO and cofounder of Rocket Games, scalability was one of the main requirements for their analytics solution.

Rocket Games creates casino games for mobile platforms. As they gained traction, it was clear they were outgrowing their current mobile analytics vendors. Queries were slow and it would take 1-2 days for data to reach their dashboards. For a team that releases updates several times a week, on over 40 separate games and 4 different platforms, this was unacceptable.

To understand how their users were interacting with their games, Rocket Games had to have access to real-time analytics, so they could get insights quickly and continue improving their product.

“We want to see the data flow through very quickly,” said Steven. “We don’t want to wait an entire day waiting to see what’s happening in our game right now.”

When Rocket Games started using Amplitude, they were tracking about 200 million events each month. Now they’re tracking more than 3 billion–with no decreases in speed and performance.

“The performance of the UI has stayed the same since day one — it’s almost magical. We’re throwing so much more data at Amplitude now, and it just works,” said Steven.

Zouhair Belkoura, CEO of KeepSafe, had a similar experience. The KeepSafe app allows users to privately save photos and files behind a PIN lock on their mobile devices. KeepSafe outgrew their previous analytics tool when they hit 10 million daily active users. They discovered that Amplitude could not only scale with their data volume (and cost), but also provide the insights they needed quickly.

Within a year, KeepSafe was tracking 6 billion events and counting. They no longer had to worry about what to track.

The ideal analytics platform lets you answer questions when they arise. “You don’t have to define your questions at the time of instrumentation. That is really how it should work. You collect and leverage it to answer questions that come up later,” said Zouhair.

Amplitude’s scalable platform effectively made Rocket Games’ and KeepSafe’s data more “accessible,” in the sense that they were able to track, analyze, and get more insights from their data than they otherwise could have.

Next Level Scalability

Last year, we talked in depth about Amplitude’s Wave Architecture and how it allows us to tackle the challenge of scaling robustly and cost-effectively, without sacrificing performance. This year, our infrastructure is getting an even more formidable upgrade, one that will make your queries even faster at larger data volumes. We can’t say too much just yet but stay tuned and subscribe to our blog for more!

About the Author
Archana is an Instructional Designer on the Customer Education team at Amplitude. She develops educational content and courses to help Amplitude users better analyze their customer data to build better products.