Category Archives: Engineering


Analytics that Doesn’t Compromise on Data Integrity

At Amplitude, we believe first and foremost in providing the best product analytics. We find the right solution for our users and then figure out how to make it happen on the engineering side. This is in contrast to other analytics services or in-house analytics teams that make compromises on data integrity because it’s easier from a technical perspective. But one of the top reasons that people don’t use analytics to make decisions is that they don’t trust the data. And for good reason — those of us building analytics have historically chosen to sacrifice accuracy when it makes systems easier to build. However, we believe that the role of analytics is changing, and that analytics can and needs to be better than that.

Read the full post on our Engineering Blog to learn about 4 technical problems we solved to ensure data integrity >>


How Hackathons Can Drive Velocity and Disrupt Your Product Roadmap

Hackathons are a time-honored tradition at many tech companies. They’re a time for everyone to break free from their day-to-day work and innovate. Here at Amplitude, hackathons have been a great way of bypassing the traditional processes of product development to disrupt our own roadmap, as well as an opportunity to foster cross-functional teamwork and relationships. We’ve taken to doing a hackathon at the start of every quarter, and are coming hot off our third with some fresh ideas and ambitious projects.

Check out the highlights from our July Hackathon on the Engineering Blog >>


New Engineering Post: Reducing Kafka costs with Z$tandard

One of the major challenges that technology startups face is scaling up effectively and efficiently. As your user base doubles or triples, how do you ensure that your services still run smoothly and deliver the same user experience? How do you maintain performance while staying cost-efficient? Here at Amplitude, our customers have tracked more events in the past year than in the first three years of our company combined. As we and our customers grow, we need to continue providing the same, if not better, service across our platform. Previously, we explained how Nova, our distributed query engine, searches through billions of events and answers 95% of queries in less than 3 seconds. In this blog post, we will focus on the data processing pipeline that ingests and prepares event data for Nova, and explain how we stay cost-effective while our event volume multiplies.

Check out the full post on our Engineering Blog >>

Distributed Real-time Data Store with Flexible Deduplication

In the world of “big data”, businesses that can quickly discover and act upon insights from their users’ events have a decisive advantage. It is no longer sufficient for analytics systems to rely solely on daily batch processing. This is why our new column store, Nova, continues to use a lambda architecture. In addition to a batch layer, this architecture has a real-time layer that processes event data as it comes in. The real-time layer only needs to maintain the last day’s events. In a previous post, we focused on the batch layer of Nova. Designing the real-time layer to support incremental updates for a column store creates a different set of requirements and challenges. We will discuss our approach in this post.


Flow of data through a generic lambda architecture (source)
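
To make the batch and real-time split concrete, here is a minimal sketch in Python of how a query could combine the two layers. It is not Nova's implementation. The names `Event`, `RealTimeLayer`, and `query_count` are hypothetical, and deduplicating on a unique `event_id` is an assumption made for illustration.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Event:
    event_id: str    # hypothetical unique id, used here for deduplication
    event_type: str
    timestamp: int   # seconds since epoch


@dataclass
class RealTimeLayer:
    """Holds only events newer than the last batch cutoff."""
    cutoff: int
    counts: Counter = field(default_factory=Counter)
    seen_ids: set = field(default_factory=set)

    def ingest(self, event: Event) -> bool:
        """Apply one event incrementally; return True if it was counted."""
        # Events at or before the cutoff are already covered by the batch view.
        if event.timestamp <= self.cutoff:
            return False
        # Skip events whose id has already been seen (simple deduplication).
        if event.event_id in self.seen_ids:
            return False
        self.seen_ids.add(event.event_id)
        self.counts[event.event_type] += 1
        return True


def query_count(batch_view: Counter, realtime: RealTimeLayer, event_type: str) -> int:
    """Answer a query by merging the precomputed batch view with real-time counts."""
    return batch_view[event_type] + realtime.counts[event_type]
```

In this sketch the batch view is just a precomputed `Counter`, and the real-time layer stays small because it only tracks events after the cutoff. When a new batch run completes, the real-time layer would be replaced with a fresh one at the new cutoff.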



Slack + Amplitude: Making it easier for teams to share and discuss user insights

Why & how we built a Slack app for Amplitude

If your team is anything like ours, you’re in Slack…a lot. At Amplitude, almost all internal communication happens in Slack, and it’s even our preferred method for talking to some of our customers.

That’s why, when we were thinking about how to help teams share and discuss insights from user data, Slack was the first thing that came to mind. In fact, many of our customers told us that they were taking screenshots of Amplitude graphs and pasting them into Slack for further discussion, which is not exactly an ideal workflow.