From Chaos to Clarity: How to Scale Your Analytics Taxonomy

A case study in moving from 500+ scattered events to a scalable, user-friendly data structure
Product

Mar 3, 2026

11 min read

When your product grows from a simple platform to a complex ecosystem, your analytics can quickly become a tangled mess. I have worked with several Amplitude customers to tackle this exact challenge. Each of their individual stories offers valuable lessons for any company struggling with analytics taxonomy at scale.

I want to use this post to share some lessons that I think everyone can use as they scale up and embrace that complexity. Let’s start with the basics.

What is an event taxonomy?

An event taxonomy is how you choose, name, and organize your data with events and their associated properties. Its purpose is to accurately capture the actions users take within a digital experience. These events are typically captured either through explicit instrumentation or automatically as users navigate that experience. They are then used to measure the effectiveness of the user journey against specific analytical goals, which are generally aligned with a company's broader objectives.

Why is an event taxonomy so important?

So why such a strong focus? An event taxonomy is the foundation on which everything is built. When your event taxonomy is clear, consistent, and relevant, the data feels intuitive and easy to work with. This naturally leads to new insights. When it’s confusing, inconsistent, or bloated, it quickly becomes a blocker for adoption and getting to insights.

The biggest problem: when growth outpaces governance

The challenge our customers face will resonate with many product teams. What started as a straightforward product has evolved into a multifaceted tool with many additional features, multiple digital properties across platforms, and several new critical user flows.

Along the way, their analytics structure grew increasingly complex:

  • Dramatic increase in the number of events: analysis shifts from a quick task to a painstaking and time-consuming undertaking.
  • Inconsistent implementation: with each team creating events in silos, inconsistent naming and triggering patterns cause conflicts when projects are transferred. This also creates problems with cross-product analysis.
  • Tribal knowledge and team turnover: the company grows so fast that processes lag and tribal knowledge remains undocumented. This potentially leads to new joiners reinventing the wheel since they have no knowledge of preexisting standards or governance.

As a result of this complexity, governance begins to erode. The breaking point comes when someone new joins the team and finds the data “too messy” for meaningful analysis. A big red flag that I see is when even basic segmentation analysis has to be done outside of Amplitude because of the chaotic structure.

Our solution: standardizing on a noun+verb format

Rather than going with completely generic event naming, the majority of our customers have decided to implement a structured “noun+verb” format. This approach offers several key benefits:

Clear, consistent naming

Use descriptive phrases, so that event names follow a predictable pattern:

  • “File export completed” instead of “export_file”
  • “Registration completed” instead of “sign_up”

Also maintain consistent casing. I personally prefer sentence case with spaces for event names (as in the examples above), and lowercase with spaces for properties.
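
As a sketch of how to enforce such a convention, a small linter can run in CI or code review before events ship. The specific rules below (spaces instead of snake_case, leading capital, plain characters) are assumptions based on the examples above; adapt them to your own standard:

```python
import re


def check_event_name(name: str) -> list[str]:
    """Return a list of convention violations for an event name.

    Conventions assumed here (adapt to your own standard):
    - spaces rather than snake_case or kebab-case
    - starts with a capital letter, e.g. "File export completed"
    - letters, digits, and spaces only
    """
    issues = []
    if "_" in name or "-" in name:
        issues.append("use spaces, not snake_case or kebab-case")
    if not name[:1].isupper():
        issues.append("start with a capital letter")
    if not re.fullmatch(r"[A-Za-z0-9 ]+", name):
        issues.append("use letters, digits, and spaces only")
    return issues


print(check_event_name("export_file"))            # flags the legacy name
print(check_event_name("File export completed"))  # []
```

Running this over your event catalog turns "inconsistent naming" from a subjective complaint into a concrete, reviewable list of fixes.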

Strategic use of event properties

By using properties instead of unique event names for variations, teams can track detailed context while keeping the overall event catalog manageable. Just like your event names, property names should follow a predictable pattern.

The new structure leverages properties to add context without expanding event names. Here are some practical examples:

  • status property to understand one level down (e.g., success/error)
  • section/flow property to track which section of the product triggered the event (e.g., onboarding)
  • feature property to track which product area triggered the event (e.g., file manager)
  • origin/referring page property to track where the interaction was triggered from (e.g., top navigation, PDP)

This also future-proofs any potential changes to your product, since properties can have any number of values. Make sure that you clearly define the list of valid values to ensure proper naming standards and governance.
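
One lightweight way to enforce a defined list of valid values is a schema check before events are sent. The property names and allowed values below are illustrative examples, not a prescribed standard:

```python
# Illustrative allow-lists for governed property values;
# replace these with the values defined in your own governance doc.
ALLOWED_PROPERTY_VALUES = {
    "status": {"success", "error"},
    "flow": {"onboarding", "checkout", "settings"},
    "feature": {"file manager", "search", "sharing"},
}


def validate_properties(properties: dict) -> list[str]:
    """Return violations for property values outside the governed allow-lists.

    Properties without an allow-list pass through unchecked, so free-form
    context (e.g. a file name) is still permitted.
    """
    violations = []
    for key, value in properties.items():
        allowed = ALLOWED_PROPERTY_VALUES.get(key)
        if allowed is not None and value not in allowed:
            violations.append(f"{key}={value!r} is not in the allowed set")
    return violations


print(validate_properties({"status": "success", "flow": "onboarding"}))  # []
print(validate_properties({"status": "ok"}))  # flags the unapproved value
```

Catching an unapproved value like `status="ok"` at implementation time is far cheaper than cleaning up a split `success`/`ok` segment months later.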

Making your taxonomy AI-ready

As soon as LLMs start consuming your analytics data, they need more than just a list of cryptic event and property names. Without clear naming conventions and rich metadata, the model has no way to infer what “signup_complete,” “exp_file_ok,” or “reg_v2” actually mean in your product. This confusion leads to vague or incorrect answers, missed insights, and a lot of manual clarification work on your side.

To make your data more readable, design your taxonomy so that a non-expert (including AI) can understand it at a glance. In practice, this means:

  • Use full, descriptive English phrases for key events and properties rather than terse abbreviations (e.g., “File export completed” is clearer than “export_file_ok”).
  • Add concise and clear descriptions to your top events and their most important properties so that the LLM can understand the context, when the event fires, and why it matters to your business.
  • Avoid ambiguous names like “Export file” or “Submission success” that could describe multiple points in a flow (click vs. completion, client vs. server, etc.). If you can’t change the name, use a very clear description that explains the nuance.

For example, instead of naming an event “Import file,” rename it to “File import completed” and document that it fires only once the file has been successfully imported and is ready for the user to use. This combination of explicit naming plus metadata gives LLMs enough context to answer questions accurately and to distinguish between similar steps in a flow.
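One possible shape for this kind of metadata is a simple data dictionary keyed by event name. The field names here are an assumption for illustration, not an Amplitude schema:

```python
# Hypothetical data-dictionary entries; field names are illustrative.
EVENT_DICTIONARY = {
    "File import completed": {
        "description": (
            "Fires once the file has been successfully imported and is "
            "ready for the user to use (server-side completion, not the "
            "click on the import button)."
        ),
        "properties": {
            "status": "Outcome of the import, e.g. success or error.",
            "feature": "Product area that triggered the import, e.g. file manager.",
        },
    },
}


def describe(event_name: str) -> str:
    """Return the documented description, or a warning for undocumented events."""
    entry = EVENT_DICTIONARY.get(event_name)
    if entry is None:
        return f"No documentation found for {event_name!r}"
    return entry["description"]


print(describe("File import completed"))
print(describe("reg_v2"))  # surfaces the documentation gap
```

A dictionary like this can be exported to whatever context an LLM consumes, and the "no documentation" branch doubles as a coverage report for your top events.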

Other common event naming challenges and solutions

Tense clarity issues

Problem: Without a past-tense form (“-ed” endings for regular verbs), the timing of the tracked event is ambiguous. For example, does “Export file” fire when the user clicks export, or when the file export is completed?

Solution: Establish clear definitions and make good use of metadata descriptions. For example, replace “Export file” with “File export completed.” 

Cross-platform consistency

Problem: Page vs. Screen terminology creates confusion

Solution: Select one of the two as a standard across platforms, or choose a platform-agnostic term (e.g., “Experience,” which would represent both Page and Screen) early and document the choice clearly.

Granular tracking temptation

Problem: Teams want to track every interaction, like button clicks, CTA hover events, etc.

Solution: Focus on high-value, intentional user interactions driven by your use cases, starting with milestone events, which represent the key steps of specific flows or features to analyze. Session replay products are a better fit for assessing granular UI elements because they provide richer insights with less implementation complexity.

The three-pillar optimization approach

Based on our customers’ experience, successful taxonomy transformation requires three primary components:

1. Use case-driven analysis

Understand your product flows before restructuring events. Map your taxonomy to the real user journey to ensure you support the most common business questions.

2. Event optimization roadmap

Audit existing events against best practices on a regular basis, prioritize changes by impact, and create a cleanup roadmap.
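As a sketch of what such a prioritized roadmap could look like, you can rank non-conforming events by volume so the highest-impact renames come first. The "legacy if it contains underscores" rule is a deliberately simplified stand-in for your full convention checks, and the event counts are made up:

```python
def build_cleanup_roadmap(events: dict[str, int]) -> list[tuple[str, int]]:
    """Rank non-conforming events by monthly volume, highest first.

    `events` maps event name -> monthly event count. Here a name is
    flagged as legacy simply if it contains underscores; in practice you
    would plug in your complete naming-convention checks.
    """
    legacy = {name: count for name, count in events.items() if "_" in name}
    return sorted(legacy.items(), key=lambda item: item[1], reverse=True)


roadmap = build_cleanup_roadmap({
    "export_file": 120_000,
    "sign_up": 450_000,
    "File export completed": 80_000,
})
print(roadmap)  # highest-volume legacy events first
```

Sorting by volume keeps the roadmap honest: fixing the one event that powers most dashboards beats renaming ten rarely used ones.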

3. Data governance process

This is the most critical piece. It includes using metadata descriptions to document events and developing internal training material for ongoing team education. Without proper governance, even the best taxonomy will degrade over time. This is something that no automated solution can completely solve for you, although some features can certainly help with reactive cleanup.

Key lessons for your taxonomy

Start with governance, not tools

Our customers’ real challenges aren’t technical—they’re organizational. Multiple teams create duplicate events with different prefixes, which results in more chaos than any naming convention could solve.

Resist the temptation to go (back to) generic

When facing complexity, teams often revert to generic naming like “button_click” and “page_view” because it’s neat and tidy. In my opinion, this represents a 10-year regression. It might solve short-term implementation challenges, but it creates long-term analysis and adoption problems.

Plan for scale and AI readiness

Your taxonomy should accommodate growth. Striking the right balance between abstraction and precision is an art, not a science. Remember to use clear, consistent names and add descriptions throughout to make it ready to be used with AI. You and your team will explore and define what works best in your own context as we enter this new chapter of data management.

Making the change

If you’re facing similar challenges, consider this approach:

  1. Audit your current state: How many events do you really have? Which teams own and create what? What is the current governance structure?
  2. Design or refine your structure: Choose “noun+verb” format with strategic property usage
  3. Create a prototype: Build surrogate events to test usability before implementation
  4. Contextualize: Add relevant context and descriptions directly in Amplitude to increase LLM response accuracy. Alternatively, ensure your data dictionaries are kept up-to-date in a centralized solution that the LLM can also access.
  5. Implement governance: Train teams, establish ongoing processes or build upon existing ones, then revamp as needed
  6. Migrate gradually: Phase the transition to avoid breaking existing analysis
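
For the gradual migration step, one low-risk pattern is to keep legacy events flowing while mapping them to their canonical names at query or transformation time, so existing analyses keep working during the transition. The mapping below is purely illustrative:

```python
# Hypothetical legacy-to-canonical mapping used during a phased migration.
EVENT_ALIASES = {
    "export_file": "File export completed",
    "sign_up": "Registration completed",
}


def canonical_name(raw_name: str) -> str:
    """Map a legacy event name to its new canonical name, if one exists.

    Names already in the new format pass through unchanged, so the same
    function can run over mixed old-and-new data during the transition.
    """
    return EVENT_ALIASES.get(raw_name, raw_name)


print(canonical_name("sign_up"))                # "Registration completed"
print(canonical_name("File export completed"))  # unchanged
```

Once all clients emit the new names and historical analyses have been ported, the alias table can be retired.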

Invest in your data taxonomy

Good taxonomy isn’t just about clean data—it’s about enabling your teams to move fast while maintaining analytical rigor. As our customers have discovered, the investment in structured thinking pays dividends when your teams need reliable data or when new team members join and need to understand your product quickly.

The goal isn’t a perfect taxonomy from day one. It’s building a system that can evolve with your product while keeping your data comprehensible and actionable.

About the author

Salvatore Nastasi

Senior Solutions Architect at Amplitude

Salvatore Nastasi is a Senior Solutions Architect within Amplitude’s Professional Services, based in Paris. He specializes in complex customer implementations and data optimization. He works with customers across all industries and sizes to transform their analytics infrastructure and governance practices, and to help teams ship with confidence. Salvatore combines deep technical knowledge with practical data consulting experience, often serving as the bridge between complex requirements and business objectives. When not optimizing data taxonomies or architecting migration strategies, he can be found on his couch streaming his favorite shows or traveling the world with his family.
