4 Learnings in Our Journey from EC2-Classic to VPC

If you haven't made the switch to virtual private cloud (VPC) yet, the improved security and pricing makes it well worth the effort. To get you started, here's what we learned in our year-long migration to VPC.
Insights

Oct 17, 2018

14 min read

Julien Dubeau

Senior Engineer, Amplitude


In 2014, Amazon made VPC (virtual private cloud) the standard deployment environment for all new applications created on Amazon Web Services. If you started your AWS account before then, and you haven’t migrated, it’s likely that your application still uses EC2-Classic (elastic compute cloud).

There are plenty of compelling security, pricing, and architectural reasons to migrate your existing EC2-Classic architecture over to VPC, if you haven't yet.

The problem is, if you run 1,400 separate EC2 instances and 50+ services like Amplitude does—many of which have either strict uptime requirements or the potential for data corruption—migrating to VPC can be quite the experience.

It took us over a year to complete the full end-to-end migration, involving 30,000 lines of code in our devops repo alone, but we did it, and we managed to build in various other upgrades to boot. Here are four of the most significant learnings we picked up along the way.


1. Don’t use a weighted DNS round-robin to progressively roll out a new version of a customer-facing service

When you’re rolling out a new version of a customer-facing service in AWS, there’s a bit of faux DNS magic that is tempting to use but ultimately dangerous: the weighted round-robin.

Here’s how it works: each of the resource sets that a DNS record points to receives traffic in proportion to a weight factor you specify. If you’re rolling out a new service, adjusting those weights per cluster can seem like a magically convenient way of doing things: you can send just 10% of your traffic to a spritely new test cluster, keeping the remaining 90% going to your curmudgeonly, battle-hardened veteran cluster.
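To make the mechanics concrete, here's a minimal Python sketch of how a weighted record set distributes requests. The cluster names and weights are illustrative, not Amplitude's actual configuration:

```python
import random

def pick_cluster(weights, rng=random):
    """Choose a cluster for one request, Route 53-style: each record is
    selected with probability weight / sum(all weights)."""
    threshold = rng.uniform(0, sum(weights.values()))
    for cluster, weight in weights.items():
        threshold -= weight
        if threshold <= 0:
            return cluster
    return cluster  # guard against floating-point rounding at the boundary

# 10% of traffic to the spritely test cluster, 90% to the veteran cluster.
weights = {"test-cluster": 10, "veteran-cluster": 90}

rng = random.Random(0)
counts = {name: 0 for name in weights}
for _ in range(10_000):
    counts[pick_cluster(weights, rng)] += 1
# counts["test-cluster"] lands somewhere near 1,000 of the 10,000 requests
```

The catch described below is that this distribution is only enforced at DNS resolution time: once a resolver caches an answer, dialing a weight down to zero affects only the lookups that actually respect your TTL.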

The scheme can also seem great when the spritely test cluster decides it hates you and your 500 rate shoots up by 10x. When that happens, you simply set the weight on the test cluster to 0%, and after a brief TTL period, traffic seems to magically stop flowing to it.

But, even if you have an appropriate TTL on your DNS record, there are actually ISPs that will completely ignore your TTLs (and even, in some cases, individual customers with their own overly aggressive caching). In the worst-case scenario, 100% of your customers might happen to use an ISP that ignores your TTL, and some percentage of their requests will continue hitting the test cluster for an amount of time that is completely outside your control.


This pitfall may actually be fine for some folks – it all depends on the context. However, given that easier and more reliable alternatives are available, I’d suggest generally avoiding the weighted round-robin as a progressive rollout mechanism.

Engineering tip:

There are certainly less finicky ways of progressively rolling out a test cluster that will give you a lot more control. One of the simplest options in AWS is to use two Auto Scaling groups, assuming you’re dealing with a service that is appropriate for Auto Scaling. All you have to do is manage the scaling of the two groups independently. For example, you can arrange the first cluster to be fixed at 10 nodes, while the second cluster scales between 50 and 100 nodes. Then, once you’re more comfortable with the new cluster, you can tweak the scaling.

Whatever option you choose, ideally it will allow you to control the traffic flow as reliably as possible, otherwise you may find yourself in a nightmare situation when you discover that you’re dealing with an ISP whose company slogan is “we might connect you to something, but only if we feel like it.”

2. Failing to plan is planning to fail, but trying to plan everything is failing before you’ve even started.

For a migration of this size, an iterative, gung-ho approach actually gives you a higher likelihood of project completion than trying to have a perfect plan. This is because it allows you to experiment early and often, and it enables you to communicate incremental progress to the rest of the company much more effectively.

For anyone who is passionate about their work, it’s easy to get sucked into trying to plan every detail of how things are going to work. Spoiler alert: you’ll probably miss most of the things that actually matter. See, once upon a time at another company, I attempted a project that was very similar to the Amplitude VPC migration. We spent a year planning out the migration, trying to anticipate every step we’d encounter along the way, and never really got around to making concrete progress.

At Amplitude, we didn’t hesitate, and I believe that was absolutely crucial to the eventual completion of the VPC migration. By the end of the first week of the project, I had a production service in the new VPC, and it grew organically from there. We began by building out the core VPC networking architecture, then we figured out how to make our configuration management systems behave differently depending on whether a server was in EC2-Classic or VPC, and we made some first-pass decisions on how to organize our Terraform code. After that, we were pretty much off to the races.


Engineering tip:

Nobody in the world can effectively manage a software project with a deadline that is one year in the future. However, any somewhat decent engineer can manage a project with a deadline that is two weeks away. So, if you have a long-term project, break it up into a series of projects that each add incremental value.

This micro-project approach tends to work far, far better from a project management perspective, as well as from a business perspective. Every new feature you add is in some sense an experiment and a dialogue with your customers. You add the feature, you receive feedback from your customers, and you adjust as needed based on your learnings. So, if your process is full of experiments that take an entire year just to set up, then by the time you start learning anything from your customers, there’s probably a competitor who has already learned all of those same lessons and is miles ahead of you.

In the context of the VPC migration at Amplitude, this process of breaking down the yearlong project into smaller ones essentially amounted to having an idea of the value add for each individual service as it was moved to VPC. For example, when we moved our frontend load balancer to VPC, we knew that this would give us an easy way to leverage HTTP/2, because Amazon’s VPC-only ALB service supports it out of the box. This is independently valuable: even if we had aborted the rest of the VPC project at that point, we still would have maintained the additional value. As another example, when we moved our main query engine to VPC, we also switched the instance type to make use of the NVMe instance stores that come with i3 instances, which gave us some nice performance gains. If we had decided to quit and work on something else at that point, those performance gains would still have been worthwhile.

3. Beware of the shiny new AWS instance type—do your own testing, and don’t commit too early.

While on the subject of the i3s, I do have to issue a strong warning about experimenting with instance types. Do your research. Investigate errors others have encountered, read the fine print, and test under workloads that resemble your production workload as closely as possible.

In our case, we first tried out the i3 instances on one of our Kafka clusters. The move promised better performance for less money, since our previous setup relied on large EBS attachments instead of the free instance stores. I wrote a custom Kafka migration script to handle moving the huge amounts of data in our system without being too disruptive to anything upstream or downstream, complete with a goofy Slack bot that would ping everyone with a deliberately obnoxious emoji whenever there was progress.


Victory was at hand, until Kafka did what it does best and reminded us all that we’re merely foolish mortals and Trix are for kids. In dramatic fashion at 3AM one morning, the majority of the hosts in that Kafka cluster began crashing and spewing scary looking messages into syslog that looked like:

blk_update_request: I/O error, dev nvme0n1, sector 123290040

Every time a Kafka process crashed, we would go and restart it, only to see it crash again a few minutes later, always with those menacing nvme errors piling up.
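If you want an early warning rather than a 3AM surprise, errors in this format are easy to watch for. Here's a hypothetical syslog-scanning sketch; the pattern matches the kernel message shown above, and the function names are illustrative:

```python
import re

# Matches kernel lines like:
#   blk_update_request: I/O error, dev nvme0n1, sector 123290040
NVME_IO_ERROR = re.compile(
    r"blk_update_request: I/O error, dev (?P<dev>nvme\d+n\d+), sector (?P<sector>\d+)"
)

def find_nvme_errors(log_lines):
    """Return (device, sector) pairs for every NVMe I/O error found."""
    errors = []
    for line in log_lines:
        match = NVME_IO_ERROR.search(line)
        if match:
            errors.append((match.group("dev"), int(match.group("sector"))))
    return errors

sample = [
    "kernel: blk_update_request: I/O error, dev nvme0n1, sector 123290040",
    "kernel: EXT4-fs (nvme0n1p1): mounted filesystem",
]
hits = find_nvme_errors(sample)  # [("nvme0n1", 123290040)]
```

Wiring a check like this into your log pipeline would at least turn a crash loop into an alert before the whole cluster goes down.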

Google seemingly came to the rescue, as we discovered a bug thread related to buffer I/O errors on i3 nvme devices. After scrolling through the thread long enough for my doubts to simmer and thicken into a demi-glace of suck, I eventually found a concrete recommendation that seemed too crazy to be made up, which also required a reboot of all the impacted hosts.

The recommendation turned out to just buy a bit of time at best. The next day, all of the hosts crashed again for the same reason. We brought back the custom Kafka migration script, naturally still with the deliberately obnoxious emoji notifications, and reluctantly moved all the data back to the original cluster so that we could regroup in peace. After a series of AWS support messages, we eventually realized i3 instances are only officially recommended for certain operating systems – and ours was not on the list.

The exact quote from the i3 launch announcement, however, was “In order to benefit from the performance made possible by the NVMe storage, you must run one of the following operating systems…”, which doesn’t sound quite like “if you don’t use one of the following operating systems, i3 will go Jack the Ripper on your poor unsuspecting town of peaceful machines.” That second thing would have been nice to know. We also discovered that the nvme errors typically only happened under heavy load, and that we could trigger them on a test i3 box simply by using a free benchmarking tool called Bonnie++.

In true Amplitude fashion, what we actually ended up doing was redoing all of the VPC migration progress up to that point, re-migrating everything onto Ubuntu hosts so we could use the i3s safely, and it only took a week.

As a relevant aside, this feels like a good spot to point out one of the elements of Amplitude’s culture I’m most proud to be a part of. This is not a company that wavers in the face of setback. When the going gets tough and the proverbial fan is being pelted, that’s actually when I feel like we’re at our best. Redoing the entire VPC migration at 10x the pace to unblock the performance benefits of i3s was something I was directly a part of, but it’s certainly not the only time in Amplitude’s history when we’ve pivoted to come back better than ever. I see it as a key part of what makes Amplitude Amplitude.

Engineering tip:

As a result of all this, my personal policy is to never use any new AWS instance type unless a few conditions are met:

  1. We have run Bonnie++ on a sample instance and have not been able to break it.
  2. The rollback procedure we will employ at the first sign of trouble has been tested and is straightforward.
  3. We have properly researched the problems others are running into with the new instance type.
  4. We have carefully read all of the fine print in the release announcement for the instance type and have made pessimistic assumptions about anything that is unclear.

4. In the absence of trust, a yearlong project isn’t going to work.

To pull off an engineering project of this scale, you need to wholeheartedly trust your team.

I know from personal experience that I probably would have failed to complete this VPC migration project at many other companies. I’ve learned a lot since the first time I attempted something of this sort, but I actually don’t think that can account for the dramatically different results on its own; what it really boils down to is that Amplitude has invested in an incredible engineering culture over the years, and it shows.


Engineering tip:

Create a culture that empowers engineers to make the decisions needed to drive things forward, instead of setting up roadblocks, and that allows them to learn from their mistakes. While maintaining a culture like this requires an enormous amount of trust at every level of the company, I think every Amplitude engineer appreciates it tremendously, and it is a key part of what enables our success.

About the author

Julien Dubeau

Senior Engineer, Amplitude