Data Gone Wild: A Survival Guide to Event Taxonomy
For today's How I Amplitude, we're joined by Ofir Zwebner, Application Architect at Houzz, who provides a comprehensive framework for building sustainable event taxonomies. Through practical examples, Ofir demonstrates how to deconstruct your user journeys to create a data model that will ultimately define your tracking plan and event design.
“When a new team in my org comes to me and says ‘I’ve got a new feature, how do I track it,’ my first question back to them is, ‘what are the business outcomes?’”
Key Highlights
In this article, Ofir breaks down common pitfalls in event tracking, provides specific solutions for data governance, and shows how to build an event model that aligns with real user journeys while maintaining consistent, clear data structures.
The piece is long, detailed, and provides thorough context. If you want to jump down to specific parts, use the links below!
- How to properly categorize events into four key types: Business Outcomes, Screens, Funnel Steps, and UI Interactions
The Problem
Today I hope to help you understand why "track everything, make sense of it later" is bound to fail. The flip side is that we need a way to scale event tracking with proper data governance, and to create an event taxonomy that makes sense.
To break this down, I'll cover the following:
- Defining events and properties based on your product assets and user paths
- Identifying Actions and Attributes
- Event naming that reflects the world of the user
- Making the data easy to learn (Data Democratization)
- Granularity of events
- Consistent style and verb tense
Finally, we'll put these concepts into practice by creating a user event model for Google Slides.
Exciting, right? Let’s dive in. 😊
We’ll use a hypothetical product to build our use case: a music platform (though it could be anything).
The setup is simple: a web client and a server send events to a CDP, which forwards them to both Amplitude (itself a CDP) and a data warehouse.
It seems so straightforward, so manageable. With product-market fit still uncertain and investors demanding rapid iteration, who has time to think deeply about data governance?
"Look at all these screens!" says their enthusiastic Head of Growth. "Just track everything! We'll make sense of it later!"
Why tracking everything doesn’t work (even in the short-term)
The first major challenge I often see is the addition of mobile apps.
With iOS and Android, you’ve now got multiple codebases (in different languages) across platforms, different app versions in production, and multiple teams needing to sync on implementation. Even with just this common use case, things get out of control quickly.
With growth also comes the need for new products.
Take the music platform in our example. They serve listeners, but they also serve musicians who need their own services and apps, all of which require their own tracking.
This inherently creates multiple products, with end users who may not have the same goal. This is a pretty common use case for marketplace or SaaS products.
Next come the third-party integrations
Since many products sit within an ecosystem, your users will naturally ask for integrations to connect your product into their existing workflows.
This means events will start flowing back and forth between marketing tools, customer support systems, and sales platforms. Your analytics system is no longer just tracking data; it's become critical production infrastructure with inconsistent naming conventions.
Finally, someone mirrors the production database to an offline system and starts running their own analytics. Suddenly, the numbers in Amplitude don’t match the numbers from Engineering's SQL queries. Our Head of Growth finds himself in endless review meetings, trying to explain why everyone's charts tell different stories. "But the data comes from the same place!" he says. "How can the numbers be different?"
This is, to use a technical term, chaos.
The Challenges of Tracking at Scale
- Software Platforms: Different developers, from different teams, writing different code using different languages, running multiple live code versions on different platforms
- Products: Multiple product features or product lines sharing the same data plan; tracking for users, teams, contracts or projects; customer support impersonation
- Integrations: First- and third-party integrations introduce new terminology and contradictory semantics, add an operational dependency on tracking, and bring in non-user (data exchange) events
- Trust and Accountability: Inconsistencies between tracking data and other data sources raise questions about data validity
- Legal Compliance: Handling ownership, data access requests, data privacy audits
- Data Discovery and Democratization: How to make taxonomy accessible to all employees in an organization; how to handle and grasp thousands of assets.
- Velocity: Make sure governance doesn’t slow down the team from getting results
But fear not, dear reader. I’ve gone through all of this so you don’t have to.
Today, I’ll share some ways to turn this messy setup around.
Part A: One Solution: Data Governance
To track at scale without losing your mind, I recommend a number of solutions: Unified Event Protocol, Developer APIs, Data Plan Observability, and Data Quality Observability.
I cover each of these briefly in my presentation, and you can find me in the Amplitude community if you want to go deeper.
Today, I’ll focus on two important pieces of Data Governance:
- Building a data model around the user’s journey
- Defining the taxonomy guidelines
Let’s get started.
Let’s start with a simple user story:
A User Walks Into A Website:
A user lands on a music website ➡️ They navigate to an artist page ➡️ User searches for a song and tries to add it to a playlist ➡️ The user then has to create a new playlist ➡️ Navigate to that playlist ➡️ And finally, they play the song 🏆
We'll call this User Story A. Here’s a traditional way of describing this with events:
In my estimation, there are three main issues:
Issue #1: Too hard to read event names.
Issue #2: Too hard to identify significant events.
Issue #3: Too many event names. Giving every single item, element, and action its own event name causes an explosion of the namespace (aka not good).
Now, let’s retell this story with a focus on solving the problems above.
User Story A - V2: Build Around the User Journey
The user is on their homepage, then clicks to go to the artist page. Now we have a small funnel starting - adding a song to a playlist.
From there, the user clicks and is prompted to create a new playlist. The user approves and submits the playlist name.
Behind the scenes, we create a playlist and then add the song to that playlist.
The user is then taken to the playlist page and clicks to start playing music.
Make it a model - good data governance scales well
Let’s extrapolate a model from that story. Take a look at the model to the left.
Note that I’ve identified and color-coded four different types of events:
- 🟢 Business Outcomes
- 🔴 Screens
- 🟡 Funnel Steps
- 🔵 UI Interactions
Let's get a closer look at the four event types
These four event types differ in a few key ways, which I’ve outlined below. Most importantly, they vary in how conclusive they are in helping us determine whether something actually happened - business outcomes are the most conclusive and UI interactions are the least conclusive.
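One way to make this ranking concrete is to encode the four types as an ordered enum. This is only a sketch: the source says business outcomes are the most conclusive and UI interactions the least, while the relative order of Screens and Funnel Steps in the middle is an assumption made here for illustration.

```python
from enum import IntEnum


class EventType(IntEnum):
    """The four event types, ordered from least to most conclusive."""

    UI_INTERACTION = 1    # a click or keypress; may or may not lead to a change
    FUNNEL_STEP = 2       # where the user is in a process (middle order assumed)
    SCREEN = 3            # the user reached a screen (middle order assumed)
    BUSINESS_OUTCOME = 4  # something actually changed in the system


def most_conclusive(event_types):
    """Return the most conclusive event type among those observed."""
    return max(event_types)


observed = [EventType.UI_INTERACTION, EventType.SCREEN, EventType.BUSINESS_OUTCOME]
print(most_conclusive(observed).name)
```

Because the enum is ordered, evidence can be sorted or compared directly, for example to prefer a business outcome over a click when both describe the same user action.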
Finally, we'll overlay User Story A into our model
Here’s the original event taxonomy with my color coding applied:
Our final tracking plan for User Journey A
And here’s my simplified model, using Amplitude property names to illustrate what the action is.
What makes this approach better?
Keeping it simple
- By using a single "Screen View" event and a single "Click" event instead of a separate event for every screen and every click, we keep things very simple.
- By letting the properties carry the context, we keep the Event names simple.
- Doing the above makes it a lot easier to identify the significant events we want to spend more time with.
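To see how few event names this approach needs, here is a minimal sketch of User Story A as a list of events. The `event_type` and `event_properties` keys follow the general shape of an Amplitude event payload; the property names ("Screen Name", "Container", "Element"), the screen names, and the outcome names are illustrative assumptions, not the original tracking plan. Funnel steps are left out for brevity.

```python
def screen_view(screen_name):
    """One generic event for every screen; the property carries the context."""
    return {"event_type": "Screen View", "event_properties": {"Screen Name": screen_name}}


def click(screen_name, container, element):
    """One generic event for every click; properties say where and on what."""
    return {
        "event_type": "Click",
        "event_properties": {
            "Screen Name": screen_name,
            "Container": container,
            "Element": element,
        },
    }


def outcome(name, screen_name):
    """Business outcomes keep their own, significant event names."""
    return {"event_type": name, "event_properties": {"Screen Name": screen_name}}


# Hypothetical retelling of User Story A.
user_story_a = [
    screen_view("Home"),
    screen_view("Artist"),
    click("Artist", "Song List", "Add to Playlist Button"),
    outcome("Playlist Created", "Artist"),
    outcome("Song Added to Playlist", "Artist"),
    screen_view("Playlist"),
    click("Playlist", "Song List", "Play Button"),
    outcome("Song Played", "Playlist"),
]

event_names = sorted({event["event_type"] for event in user_story_a})
print(event_names)
```

Eight events in the story, but only five distinct event names, and three of those five are the business outcomes worth spending time on.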
Consistency lets us ask better questions
- Because we’re using predefined properties that are consistent across all events, we can slice the data and start asking interesting questions like:
  - How did I get this outcome?
  - What was the screen associated with a certain outcome?
  - What were the funnels that led to a certain outcome?
  - What’s the funnel completion ratio?
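Questions like these become short queries once every event carries the same properties. The sketch below works on a hand-made list of flat event records; the keys ("Funnel Name", "Funnel Step", "Screen Name"), the event names, and the sample data are all assumptions for illustration, and in practice you would ask these questions in Amplitude charts rather than in code.

```python
from collections import Counter


def funnel_completion_ratio(events, funnel_name):
    """Share of users who started the funnel and also reached its End step."""

    def users_at(step_name):
        return {
            e["user_id"]
            for e in events
            if e.get("Funnel Name") == funnel_name and e.get("Funnel Step") == step_name
        }

    started = users_at("Start")
    if not started:
        return 0.0
    return len(started & users_at("End")) / len(started)


def outcomes_by_screen(events, outcome_name):
    """Count how often an outcome happened on each screen."""
    return Counter(e["Screen Name"] for e in events if e["event_type"] == outcome_name)


def funnel_step(user_id, step_name):
    return {
        "user_id": user_id,
        "event_type": "Funnel Step",
        "Funnel Name": "Add Song to Playlist",
        "Funnel Step": step_name,
        "Screen Name": "Artist",
    }


# Made-up sample data: four users start the funnel, three finish it.
events = [funnel_step(u, "Start") for u in ("u1", "u2", "u3", "u4")]
events += [funnel_step(u, "End") for u in ("u1", "u2", "u3")]
events += [
    {"user_id": "u1", "event_type": "Song Added to Playlist", "Screen Name": "Artist"},
    {"user_id": "u2", "event_type": "Song Added to Playlist", "Screen Name": "Artist"},
    {"user_id": "u3", "event_type": "Song Added to Playlist", "Screen Name": "Search Results"},
]

print(funnel_completion_ratio(events, "Add Song to Playlist"))
print(outcomes_by_screen(events, "Song Added to Playlist"))
```

Neither function needs to know anything about specific screens or buttons, which is the payoff of keeping properties consistent across events.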
If You Remember Only One Thing
Let it be this: focus on the business outcomes first.
When a new team in my org comes to me and says “I’ve got a new feature, how do I track it,” my first question back to them is, “what are the business outcomes?”
Then I ask about the navigation aspects. Then I ask about screens and maybe we create some funnels. And finally, if they really want to track UI, sure, I won’t stop them, but I show them why tracking UI is far less impactful than the business outcome itself.
Creating your taxonomy guidelines
Once you’ve developed the broad strokes of the data language you’ll be speaking, it’s time to define how to speak that language.
In other words, it’s time to create your taxonomy guidelines. I know, it’s very exciting. Be sure you’re sitting down and have someone nearby in case you faint from joy.
But trust me here - your future self is looking back at you now and begging you to do this right from the start.
Soon you’ll put all of these skills into practice with a Google Slides event flow, so get ready.
Ramping up
Before going further, make sure you’re familiar with everything here - that's 101 stuff and won't be covered below.
Choose Your Naming Convention
There’s no wrong answer to which style you choose, but in my humble opinion, there’s definitely a right answer:
Chicago Style Title Case Is a Great Choice for a Successful Data Plan
It's readable, consistent, and gives us a reliable baseline that we can convert to any other case if needed.
And most importantly, Amplitude supports Title Case across events, properties, and property values.
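The "convert to any other case" point is easy to show. Here is a minimal sketch, assuming names are made of letters, digits, spaces, and hyphens; the function names are hypothetical helpers, not part of any Amplitude SDK.

```python
import re


def words(title_case_name):
    """Split a Title Case name into words, treating hyphens as separators."""
    return re.findall(r"[A-Za-z0-9]+", title_case_name)


def to_snake_case(title_case_name):
    return "_".join(word.lower() for word in words(title_case_name))


def to_camel_case(title_case_name):
    first, *rest = words(title_case_name)
    return first.lower() + "".join(word.capitalize() for word in rest)


print(to_snake_case("Song Added to Playlist"))
print(to_camel_case("Song Added to Playlist"))
print(to_snake_case("Signed-In User"))
```

Going the other way is harder: from `signed_in_user` you can no longer tell that the original was hyphenated, which is one reason to keep Title Case as the source of truth and derive the other cases from it.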
Define Your Glossary
Everyone loves talking about subject-action patterns and noun-verb relationships, but none of that means anything until you create a glossary.
Your glossary defines what your subjects are - the things receiving the actions - and the actions themselves, and that’s just the tip of the iceberg:
- Subjects (nouns and compound nouns)
- Actions (verbs)
- Common Abbreviations
- 3rd Party Terms
- Namespaces
- "Block Words"
- Name Separators
Across each of these, you’ll need to make decisions on:
- Casing and Hyphenation
- Singular and Plural forms
- Present and Past tenses
- Readability, Clarity and Brevity
- Adherence to common usage conventions in your domain (in my company, we look at home remodeling and have terms like 'floor plan')
- Choice of ONE language (especially important for international sites)
A few of the things I always fight for in my glossary are:
- Using "Visitor/Visitors" instead of longer terms like "Anonymous User"
- Proper hyphenation for compound terms like "Signed-In User"
- Consistent handling of compound words (e.g., Playlist as one word)
- Using clear, semantically correct terms even if longer (e.g., "Phone Number" vs just "Phone")
- Standardizing common abbreviations (e.g., "Sales Rep" vs "Sales Representative")
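A glossary is most useful when it can be checked automatically. The sketch below flags discouraged terms in a proposed name and suggests the preferred one; the mapping mirrors the examples above, but the structure and the function name are assumptions for illustration, not a feature of any tool.

```python
# Hypothetical glossary entries: discouraged term -> preferred term.
PREFERRED_TERMS = {
    "Anonymous User": "Visitor",
    "Signed In User": "Signed-In User",
    "Play List": "Playlist",
    "Sales Representative": "Sales Rep",
}


def glossary_violations(name):
    """Return (found, preferred) pairs for discouraged terms in a name."""
    lowered = name.lower()
    return [
        (discouraged, preferred)
        for discouraged, preferred in PREFERRED_TERMS.items()
        if discouraged.lower() in lowered
    ]


print(glossary_violations("Anonymous User Play List Created"))
print(glossary_violations("Visitor Playlist Created"))
```

A check like this can run when someone proposes a new event or property, so the glossary is enforced at review time instead of being discovered in a dashboard later.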
Naming conventions by event types
Remember our four event types?
- 🟢 Business Outcomes
- 🔴 Screens
- 🟡 Funnel Steps
- 🔵 UI Interactions
It’s important we have distinct naming conventions for each.
Business Outcomes
When it comes to your most important events - the business outcomes - you want to follow a clear but flexible pattern.
Start with an optional namespace if you need one, then your subject, the action, and maybe a prepositional phrase to make it crystal clear.
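As a minimal sketch, that pattern can be written as a small builder. The function and the example names are hypothetical; the only thing taken from the text is the order of the parts: optional namespace, subject, action, optional prepositional phrase.

```python
def business_outcome_name(subject, action, namespace=None, phrase=None):
    """Build '[Namespace] Subject Action [prepositional phrase]'."""
    parts = [namespace, subject, action, phrase]
    return " ".join(part for part in parts if part)


print(business_outcome_name("Playlist", "Created"))
print(business_outcome_name("Song", "Added", phrase="to Playlist"))
print(business_outcome_name("Song", "Uploaded", namespace="Artist Portal"))
```

Note that the subject comes first: "Playlist Created", not "User Created Playlist".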
Avoid these mistakes
- Avoid ambiguity
- Avoid redundancy
- Focus on the subject, not the actor
- Use affirmative terms
Screens
You have two ways to handle screens.
The first is super clean - just use one "Screen" event and let a property handle the specific screen name.
Your screen names should follow the pattern: [Namespace] [Subject] - [Activity [preposition]].
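Here is a small sketch of that pattern as a helper. The function name and arguments are hypothetical, and the hyphen with surrounding spaces is the separator shown in the pattern above.

```python
def screen_name(subject, activity=None, namespace=None):
    """Build '[Namespace] [Subject] - [Activity [preposition]]'."""
    left = " ".join(part for part in (namespace, subject) if part)
    return f"{left} - {activity}" if activity else left


print(screen_name("Slides", "Edit", namespace="Google"))
print(screen_name("Playlist"))
```

The result is meant to be the value of a screen-name property on the single "Screen" event, not a new event name per screen.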
Funnel Steps
Funnels are all about tracking a journey, so name them accordingly.
Start with your feature name, then add whether it's starting, ending, or what step you're on.
Always bookend your funnels with Start and End events - these are crucial.
If your product team makes big changes to a funnel, don't be afraid to append a “v2” on there. Let your product team name the steps (they know the product best), but keep them in your consistent format.
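A minimal sketch of these rules follows. The exact format (feature name, optional version, hyphen, step name) is an assumption, since the text only fixes the order of the parts; the feature and step names are made up.

```python
def funnel_step_names(feature, steps, version=None):
    """Return the ordered step names for a funnel, bookended by Start and End."""
    prefix = f"{feature} v{version}" if version else feature
    middle = [f"{prefix} - {step}" for step in steps]
    return [f"{prefix} - Start", *middle, f"{prefix} - End"]


for name in funnel_step_names("Add Song to Playlist", ["Playlist Name Entered"]):
    print(name)

print(funnel_step_names("Add Song to Playlist", [], version=2))
```

Generating the names from one function guarantees that every funnel has its Start and End bookends, even when the product team supplies no intermediate steps.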
UI Interactions
The key with UI interactions? Keep the event name simple.
UI Interactions: Container (Feature) and Element (Function)
There aren’t that many things that someone can do when interacting.
The real differentiation for each interaction comes in the property values that define:
- what area of the screen (container) are we talking about? and
- what specific thing (element) did the user interact with?
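In code, that means a short, closed list of interaction names and two properties doing the real work. This is a sketch under assumptions: the list of interactions, the property names, and the "Toolbar" and "Share Button" values are illustrative, while "Theme Selector" and "Theme Card" come from the Google Slides example in this article.

```python
from collections import Counter

# Assumed short list of things a user can do when interacting.
INTERACTIONS = {"Click", "Hover", "Type", "Scroll"}


def ui_interaction(interaction, screen, container, element):
    """Build a UI interaction event; container and element carry the detail."""
    if interaction not in INTERACTIONS:
        raise ValueError(f"Unknown interaction: {interaction!r}")
    return {
        "event_type": interaction,
        "event_properties": {
            "Screen Name": screen,
            "Container": container,
            "Element": element,
        },
    }


screen = "Google Slides - Edit"
events = [
    ui_interaction("Click", screen, "Theme Selector", "Theme Card"),
    ui_interaction("Click", screen, "Theme Selector", "Theme Card"),
    ui_interaction("Click", screen, "Toolbar", "Share Button"),
]

clicks_by_container = Counter(e["event_properties"]["Container"] for e in events)
print(clicks_by_container)
```

Adding a new button to the product adds a new property value, not a new event name, so the namespace stays small.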
Putting It All Together - Let’s Test Your Knowledge
Confused yet? Good. This stuff isn’t easy, I know, but it’s 10x easier than the alternative - decoding a tangled mess of disorganized data at 3 AM on a Saturday because your CEO just discovered Meta analytics and now has 47 'quick questions' about why their numbers don't match yours. Trust me.
The best way to get comfortable with all this is practice, so let’s do that now.
If you were in charge of naming this Google Slides screen for tracking purposes, what would you call it?
Here’s what I’d recommend:
✅ "Google Slides - Edit"
And here’s why:
- Clear product name ("Google Slides")
- Simple hyphen separator
- Action/view type at the end ("Edit")
- Consistent with other screens like "Google Slides - Present" or "Google Slides - Browse"
- Scales nicely across the whole application:
  - "Google Slides - Present"
  - "Google Docs - Edit"
  - "Google Sheets - Edit"
- Easy to read in analytics dashboards
- Easy to group and filter
See the pattern? [Product] - [View Type]. Simple, consistent, clear.
Now let's tackle UI Interactions
In order to properly define our Business Outcomes and Funnels, we first need to name our containers and elements.
What would you name each of these containers?
Below is how I'd name these containers.
Now what about the elements?
It’s harder to see these so I’ll just show you what I named each one. Big thing to note here is that when you see the same thing repeated (like each “Theme Card” in the “Theme Selector”), just use the same name for all of them.
Why? Because you can't predict how many there'll be, and you don't need to. One "Theme Card" name covers them all.
A Real Example: The Google Slides Share Flow
Let’s put it all together and, using the naming system above, outline what events happen when someone clicks “Share” on Google Slides. You may want to open a new Google Slides presentation on your own and go through the actual flow.
Provide the following for each event in the flow:
- Screen Name
- Funnel Name (if applicable)
- Container Name
- Element Name
(pause here to try it on your own before reading below)
Ready?
Here’s how I broke it down:
To recap the Google Slides exercise
One thing to focus on: why did I code the "Slides Name Updated" event as a Business Outcome?
Because something actually changed in our system. The presentation has a new name in the database. Unlike UI clicks which might or might not lead to changes, this represents a real, permanent change to the data. It's the kind of thing product managers care about when measuring feature adoption.
See how we have different types of events telling different parts of the story?
- 🔵 UI Interactions tell us HOW users are doing things
- 🟡 Funnel Steps tell us WHERE users are in a process
- 🟢 Business Outcomes tell us WHAT actually happened
That last one is crucial. There's a big difference between "user clicked save" (UI Interaction) and "slides name was updated" (Business Outcome). The click might fail, the network might be down, but a business outcome means something actually changed. That's why we track them separately and give them special attention.
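The difference shows up clearly in code. In this sketch the click is tracked as soon as it happens, while "Slides Name Updated" is tracked only after the save succeeds. The `track` function, the `save` callbacks, and the container and element names are hypothetical stand-ins, not Amplitude or Google APIs.

```python
tracked = []  # stand-in for a real analytics client


def track(event_type, properties):
    tracked.append({"event_type": event_type, "event_properties": properties})


def rename_presentation(presentation_id, new_name, save):
    """Track the click always, the business outcome only if the save worked.

    `save` is a hypothetical persistence function that raises OSError on failure.
    """
    screen = "Google Slides - Edit"
    track("Click", {"Screen Name": screen, "Container": "Title Bar", "Element": "Name Field"})
    try:
        save(presentation_id, new_name)
    except OSError:
        return False  # the user clicked, but nothing changed in the system
    track("Slides Name Updated", {"Screen Name": screen})
    return True


database = {}


def working_save(presentation_id, new_name):
    database[presentation_id] = new_name


def failing_save(presentation_id, new_name):
    raise ConnectionError("network is down")


rename_presentation("p1", "Q3 Roadmap", failing_save)
rename_presentation("p1", "Q3 Roadmap", working_save)
print([event["event_type"] for event in tracked])
```

Two clicks, one outcome: the first attempt failed, so only the second one produced a "Slides Name Updated" event. In a real system the outcome would typically be emitted wherever the change is actually committed.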
Wrap up
Look, I know this all sounds like a lot of upfront work - and it is! But it's like a Roth IRA (stick with me here): you pay those taxes upfront, but later on? Tax-free gains.
Same principle applies. Get in early with the product team, make sure management has your back, write everything down clearly, and be a little flexible when you need to be. Will everyone follow every rule perfectly? Absolutely not.
But that’s exactly why you need to have systems in place that won’t buckle under the pressure of a few incorrect entries.
Just remember: anyone who tells you to "track everything and figure it out later" is setting you up for a world of pain down the road.
Join the community!
Connect with Ofir and other users in our Amplitude community! We focus on actionable programs, sharing best practices, and connecting our members with peers and mentors who work on similar things.