How Amplitude Worked with AWS to Unlock AI-Driven Insights for All

AWS names Amplitude an Artificial Intelligence for Data Analytics (AIDA) solution, augmenting analytics with machine learning to bring personalization at scale to the enterprise.

November 29, 2021

Mustafa Paksoy

Director of Engineering, Amplitude

When companies think about digital personalization, Amazon, Netflix and Spotify come to mind. These organizations are famous for optimizing the customer experience so individuals feel like each interaction was designed just for them. Unfortunately, most companies encounter significant barriers to entry when attempting to do the same thing. Building 1:1 digital experiences is costly, time-consuming and resource-heavy, requiring sophisticated identity resolution and machine learning capabilities.

That’s why earlier this year we launched our personalization product, which gives all digital companies the power to produce personalized experiences at scale. And now, we’re taking that a step further. Today at the Global Partner Summit, the Amazon Web Services (AWS) team announced Amplitude’s designation as an AI for data analytics (AIDA) solution. As an AIDA solution, Amplitude leverages AWS artificial intelligence (AI) and machine learning (ML) services to take the complexity out of AI-based insights.

Amplitude utilizes AWS services to augment analytics from the Amplitude platform with ML so that everyone in the enterprise, regardless of technical background, can access the power of predictive analytics. To better understand how Amplitude, as an AIDA solution, puts personalization in the hands of every individual, we first need to understand how that ML model works and what those machine learning-driven insights look like.

Developing a Machine Learning Model

Our work to deliver automated machine learning (AutoML) at scale began with our acquisition of ClearBrain in 2020. From there, our engineering team worked to integrate ClearBrain’s technology into Amplitude’s proprietary data ecosystem and automate the four stages of a classic ML workflow for our customers: project definition, data transformation, training, and deployment.

While this workflow typically requires weeks of collaboration between engineering and data science teams to build a single model, Nova’s AutoML system shortens the process to minutes by automating each of these stages within its proprietary architecture.

Our team built the Nova AutoML system to automate the stages of a traditional ML workflow, including connecting a real-time distributed datastore to a managed model deployment service. For the training portion of our AutoML system, we leverage Amazon SageMaker. Amazon SageMaker’s Jobs feature enabled us to run TensorFlow and scikit-learn transforms at high levels of computation during model training, while managing autoscaling and batch processing of the inference stages. The out-of-the-box monitoring capabilities also allowed us to quickly iterate on testing and parameterization of our modeling infrastructure.

Personalization Reimagined

In our post from this past spring, we discussed how Amplitude made use of clustering techniques to find similarities between users. Our new model takes personalization a step further and uses deep learning to provide an even more customized experience based on aggregated user data. Now in Audiences, we use neural networks to provide easily specified, off-the-shelf product recommendations that can be retrieved on demand through our Audiences User Profile API.

Training the Model

Our recommendation system takes as input a collection of statistics (features) about a specific user. We derive these features from the data collected about our customers’ users, in a process called feature engineering.

We first collect a selection of unique events within the customer stream, based on how strongly they correlate with users performing the goal event; we then select the top N features based on correlation. We collect counts of these events as performed by each user over the past training time period, and the first and last occurrence of them, by time, as a relative value. These are our event features. The second type of features we use are property features, which contain information specific to each user such as tech platform and locale. These input features are collected for each user and provided as training input for the model.
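The event-feature steps above can be sketched in a few lines of Python. This is a minimal illustration rather than Amplitude's production code: the event tuples, the function names, the use of signed Pearson correlation between per-user event counts and a "performed the goal event" indicator, the scaling of timestamps to a 0-to-1 position within the training window, and the `MISSING` sentinel are all assumptions made for the example. Property features such as platform and locale are not covered here.

```python
from collections import defaultdict
from math import sqrt

MISSING = -1.0  # hypothetical sentinel for "user never performed this event"


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either series is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def group_by_user(events):
    """Map user -> event type -> list of timestamps."""
    per_user = defaultdict(lambda: defaultdict(list))
    for user, event_type, ts in events:
        per_user[user][event_type].append(ts)
    return per_user


def select_top_events(events, goal_event, top_n):
    """Pick the top_n event types whose per-user counts correlate most
    with whether the user performed the goal event (assumed metric)."""
    per_user = group_by_user(events)
    users = sorted(per_user)
    goal = [1.0 if goal_event in per_user[u] else 0.0 for u in users]
    event_types = {e for _, e, _ in events if e != goal_event}
    scored = []
    for event_type in sorted(event_types):
        counts = [float(len(per_user[u].get(event_type, []))) for u in users]
        scored.append((pearson(counts, goal), event_type))
    # Highest correlation first; ties broken alphabetically.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [event_type for _, event_type in scored[:top_n]]


def event_features(events, selected, window_start, window_end):
    """Per user and selected event: count, plus first and last occurrence
    as a relative position (0.0 to 1.0) within the training window."""
    span = window_end - window_start
    features = {}
    for user, by_type in group_by_user(events).items():
        row = {}
        for event_type in selected:
            stamps = [t for t in by_type.get(event_type, [])
                      if window_start <= t <= window_end]
            row[f"{event_type}_count"] = len(stamps)
            row[f"{event_type}_first"] = (
                (min(stamps) - window_start) / span if stamps else MISSING)
            row[f"{event_type}_last"] = (
                (max(stamps) - window_start) / span if stamps else MISSING)
        features[user] = row
    return features


# Toy event stream: (user_id, event_type, timestamp); "purchase" is the goal.
events = [
    ("u1", "view_item", 10), ("u1", "add_to_cart", 20), ("u1", "purchase", 30),
    ("u2", "view_item", 5), ("u2", "add_to_cart", 50), ("u2", "purchase", 60),
    ("u3", "view_item", 15),
    ("u4", "open_app", 40),
]
selected = select_top_events(events, goal_event="purchase", top_n=2)
features = event_features(events, selected, window_start=0, window_end=100)
```

In this toy stream, `add_to_cart` is performed only by users who also purchase, so it ranks first, followed by `view_item`. A real pipeline would run this over a distributed datastore rather than an in-memory list.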

Our recommendation model uses a multilayer neural network, a common setup for deep learning. The output of our model is a vector of scores, one for each item available to recommend, where each score represents the probability that the user converts after viewing that item. Our final ranking is the list of items sorted by this probability score in descending order, so the item most likely to bring a user to conversion is at the top. We implement this neural network using Keras, a deep learning library included in TensorFlow, and the network is trained using instances on Amazon SageMaker.
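To make the scoring and ranking step concrete, here is a small, dependency-free sketch of a forward pass through a two-layer network followed by the descending sort. It is not the Keras model itself: the layer sizes, the ReLU hidden activation, the per-item sigmoid output, the hand-picked weights, and the item names are all assumptions chosen so the example is easy to follow.

```python
import math


def dense(inputs, weights, biases):
    """Fully connected layer; weights[j] is the weight row for output unit j."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def relu(values):
    return [max(0.0, v) for v in values]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def score_items(features, hidden_w, hidden_b, out_w, out_b):
    """Return one conversion-probability score per recommendable item."""
    hidden = relu(dense(features, hidden_w, hidden_b))
    return [sigmoid(z) for z in dense(hidden, out_w, out_b)]


def rank_items(item_ids, scores):
    """Sort items by score, highest first."""
    ranked = sorted(zip(item_ids, scores), key=lambda pair: pair[1], reverse=True)
    return [item for item, _ in ranked]


# Hypothetical user feature vector and hand-picked weights (not trained).
user_features = [1.0, 0.0]
hidden_w = [[1.0, 0.0], [0.0, 1.0]]
hidden_b = [0.0, 0.0]
out_w = [[0.0, 0.0], [2.0, 0.0], [-2.0, 0.0]]  # one output row per item
out_b = [0.0, 0.0, 0.0]
item_ids = ["item_a", "item_b", "item_c"]

scores = score_items(user_features, hidden_w, hidden_b, out_w, out_b)
ranking = rank_items(item_ids, scores)
```

With these weights, `item_b` receives the highest score and `item_c` the lowest, so the ranking places `item_b` first. In a trained model the weights come from fitting on the engineered features, but the ranking step is the same sort.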

From ML Modeling to ML-Driven Insights

At Amplitude, we believe that the best teams are those that build their strategy around product data. With Audiences’ self-serve personalization capabilities, we empower any individual to intelligently adapt digital products and campaigns to every user based on their behavior. Nova AutoML extends the Amplitude ecosystem with predictive insights at scale: by predicting the future behavior of their end users, Amplitude customers raise the quality of their decision-making and turn insights into action faster.

Now, as an AWS AIDA partner, we can further our work with AWS to provide all enterprise organizations—and every member of those organizations—democratized access to machine learning insights. With faster insight-to-action, customers can better know, grow and keep their end users.

