Author Archives: Jeffrey Wang

About Jeffrey Wang

As Amplitude's Chief Architect, Jeffrey is responsible for supporting and scaling Amplitude's architecture and is deeply involved in the company’s technical product roadmap. Before Amplitude, he built distributed log systems at SumoLogic and Palantir.

Analytics that Doesn’t Compromise on Data Integrity

At Amplitude, we believe first and foremost in providing the best product analytics. We find the right solution for our users and then figure out how to make it happen on the engineering side. This is in contrast to other analytics services or in-house analytics teams that make compromises on data integrity because it’s easier from a technical perspective. But one of the top reasons that people don’t use analytics to make decisions is that they don’t trust the data. And for good reason — those of us building analytics have historically chosen to sacrifice accuracy when it makes systems easier to build. However, we believe that the role of analytics is changing, and that analytics can and needs to be better than that.

Read the full post on our Engineering Blog to learn about four technical problems we solved to ensure data integrity.

Nova: The Architecture for Understanding User Behavior

Amplitude has grown significantly both as a product and in data volume since our last blog post on the architecture, and we’ve had to rethink quite a few things since then (a good problem to have!). About six months ago, we realized that our old Wave architecture was not going to be effective long-term, and started planning for the next iteration. As we continued to push the boundary of behavioral analytics, we gained more understanding of what we needed from a data storage and query perspective in order to continue advancing the product.

We had two main goals for the new system: (1) the ability to perform complex behavioral analyses (e.g. Compass and Pathfinder), and (2) cost-effective scalability. After extensive research, we decided to build an in-house column store that is designed specifically for behavioral analytics. We call the resulting system Nova, and we’re excited to share the thought process around how we got here and some of the key design decisions we made.

Scaling Analytics at Amplitude

Update: In May 2016 we updated our analytics architecture to Nova. Read the article here.

Laying the foundation with pre-aggregation and lambda architecture

Three weeks ago, we announced that we are giving away a compelling list of analytics features for free for up to 10 million events per month. That’s an order of magnitude more data than any comparable service, and we’re hoping it enables many more companies to start thinking about how they can leverage behavioral analytics to improve their products. How can we scale so efficiently? It comes down to understanding the nature of analytics queries and engineering the system for specific usage patterns. We’re excited to share what we learned while scaling to hundreds of billions of events and two of the key design choices of our system: pre-aggregation and lambda architecture.