3 Challenges Data Strategists Face and How to Overcome Them

Learn about the top three data challenges I face as a data strategist, plus my tips and tricks for overcoming them.

Perspectives
July 22, 2024
Jennifer Rudin
Senior Data Strategist, Amplitude

Everyone wants to be data-driven. But it doesn’t happen by magic. In companies that are using data to inform their decisions in a smart way, you’ll likely find a set of data strategists working hard behind the scenes to create that magic.

As a data strategist, I have a lot going on. On any given day, you might find me analyzing data to identify user behavior changes, figuring out how new data will fit into our existing ecosystem, or fixing data anomalies. Amid this work, I also face a host of challenges—from communicating with stakeholders to cleaning up messy data.

Overcoming these challenges enables me to focus on more strategic work that benefits the entire organization. It means everyone can base their decisions on more meaningful data insights.

In this piece, I’ll explore some key challenges that I and other data strategists face. I’ll also share practical solutions to help make your life as a data strategist easier so that more companies can become data-driven.

Key takeaways
  • Data strategists play a crucial role in organizations by turning data into actionable insights, but they face significant challenges.
  • Poor data quality hampers analyses, emphasizing the need for robust data governance and hygiene practices.
  • Dealing with a large workload involves open communication with stakeholders to manage expectations, making the data request backlog visible, and breaking requests into smaller, tangible tasks.
  • Communicating a complete and accurate story for stakeholders requires using multiple data sources, thorough documentation, and effective storytelling that frames analyses in relatable human terms.
  • Data platforms like Amplitude help with storytelling and data hygiene by facilitating data cleaning, merging events, and transforming data.

1. Dealing with poor data quality

Poor-quality data (incomplete, unstructured, or siloed across different systems) makes for an uphill struggle for data strategists. When organizations don’t realize the importance of data hygiene, data strategists spend more time untangling and cleansing data and less time aggregating and synthesizing important business insights.

Foster a culture of strong data governance

Get leadership buy-in for data governance, and champion stakeholders who are passionate about high-quality data and about educating others on the importance of data hygiene. Emphasize the toll that poor-quality data can take on sound business decisions by reminding people that:

  • Quality decisions are backed by quality data.
  • Personalization relies on time-sensitive data updates.

Strategists can also help create a data governance framework and support it by creating educational resources for the rest of the organization. For example, a data dictionary that documents events and properties helps people learn about user behavior events and contextualize where they are on the website or app. Internal data quizzes also reinforce relevant data knowledge.

Use tools that simplify data hygiene

As products evolve, the underlying data evolves. Features can be renamed, multiple products can be collapsed into a single functionality, and how we define active users or power users changes over time. Amplitude offers flexibility to merge duplicate events, rename property values, build custom events, drop and block invalid or antiquated data, and transform historical data into newly tracked data to bridge the gaps when instrumentation evolves.
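To make this concrete, here’s a minimal sketch, in plain Python rather than Amplitude’s no-code tooling, of what merging renamed or duplicate events into a canonical taxonomy amounts to. The event names and mapping are hypothetical.

```python
# Hypothetical legacy-to-canonical event name mapping. Amplitude's
# merge and transform features do this without code; this sketch
# only illustrates the underlying idea.
EVENT_ALIASES = {
    "Clicked Sign Up": "signup_started",  # renamed during a redesign
    "SignUpButtonTap": "signup_started",  # old mobile instrumentation
}

def canonicalize(event: dict) -> dict:
    """Return a copy of the event with its canonical event type."""
    name = event["event_type"]
    return {**event, "event_type": EVENT_ALIASES.get(name, name)}

events = [
    {"event_type": "Clicked Sign Up", "user_id": "u1"},
    {"event_type": "signup_started", "user_id": "u2"},
]
print([canonicalize(e)["event_type"] for e in events])
# ['signup_started', 'signup_started']
```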

Amplitude in action

How we keep our data clean

When Amplitude released Session Replay, our Amplitude taxonomy accrued 25 new events—and counting. As a data strategist, staying aware of evolving taxonomies is crucial, especially when accounting for user-generated events versus system-generated events. To support new product releases effectively, you want to have a clear understanding of which events are core to that product usage. That means alignment on the definition of activation events.

We created custom events for Session Replay to analyze product adoption. The custom events gave us a consistent definition of “active usage of Session Replay” across all internal Amplitude users in both charts and cohorts. This reduced data misuse across teams and ensured consistent customer adoption counts across all inventory, leading to stronger data trust.

Amplitude also enables data transformation through derived properties. Derived properties are newly created properties with retroactive data. They stem from applying various functions and operators to existing property data and help turn raw data into more meaningful information.

We’ve built various derived properties to support our analytics and cohorting needs, including the days since a contract began and the days until it ends (see this Community post for the technical setup). This enables us to suppress accounts with a contract end date in the next ‘n’ days from our growth marketing targeting.
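As a rough illustration of the logic behind that derived property (the real version is configured with Amplitude’s derived-property functions, per the linked Community post), here’s a plain-Python sketch with hypothetical account data:

```python
from datetime import date

# Hypothetical accounts with contract end dates.
accounts = [
    {"account_id": "a1", "contract_end": date(2024, 8, 1)},
    {"account_id": "a2", "contract_end": date(2025, 3, 15)},
]

def days_until_contract_end(account: dict, today: date) -> int:
    return (account["contract_end"] - today).days

# Suppress accounts whose contract ends within the next n days.
n = 30
today = date(2024, 7, 22)
targetable = [a for a in accounts if days_until_contract_end(a, today) > n]
print([a["account_id"] for a in targetable])  # ['a2']
```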

We also collected customer satisfaction survey responses in a long string format, which was difficult to analyze. By mapping these responses to numerical values in Amplitude, we created a new property that was backfilled onto the responses. From there, I could easily run numerical analysis to figure out things like the average survey response and the distribution of responses (see this Community post for the technical setup).
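Conceptually, that mapping works like the sketch below. The response strings and scores are hypothetical, and the real version was backfilled as an Amplitude property rather than computed in code:

```python
from collections import Counter

# Hypothetical mapping from long-form survey strings to scores.
RESPONSE_SCORES = {
    "Very dissatisfied": 1,
    "Dissatisfied": 2,
    "Neutral": 3,
    "Satisfied": 4,
    "Very satisfied": 5,
}

responses = ["Satisfied", "Very satisfied", "Neutral", "Satisfied"]
scores = [RESPONSE_SCORES[r] for r in responses]

print(sum(scores) / len(scores))  # average response: 4.0
print(Counter(scores))            # distribution: Counter({4: 2, 5: 1, 3: 1})
```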

No-code data governance within Amplitude empowers data governors to correct common implementation mistakes for cleaner insights without the need for engineers to update their code base. This guarantees a threshold of data consistency when product instrumentation evolves. It’s easy to take action and own these end-to-end changes, like setting up drop filters to remove duplicates and unnecessary noise from incoming data.
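Drop filters are configured in Amplitude’s UI rather than written as code, but a sketch of the underlying idea, dropping repeat events that share a user, event type, and timestamp, might look like this (the fields chosen for the deduplication key are an assumption):

```python
def drop_duplicates(events):
    """Keep only the first occurrence of each (user, type, time) triple."""
    seen = set()
    for event in events:
        key = (event["user_id"], event["event_type"], event["time"])
        if key in seen:
            continue  # duplicate: drop it
        seen.add(key)
        yield event

events = [
    {"user_id": "u1", "event_type": "page_view", "time": 1000},
    {"user_id": "u1", "event_type": "page_view", "time": 1000},  # duplicate
    {"user_id": "u1", "event_type": "click", "time": 1001},
]
print(len(list(drop_duplicates(events))))  # 2 events survive
```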

2. Prioritizing workload and time

Everyone who requests analyses from a data strategist wants results ASAP. For the strategists on the receiving end, juggling those multiple ‘urgent’ requests is overwhelming and stressful.

Collaborate with stakeholders on analysis prioritization

Strategists can streamline prioritization by engaging stakeholders in impact vs. effort discussions when they make requests. Sometimes, people ask analysts to investigate something out of curiosity without a specific goal. Other times, the analysis is crucial for making informed decisions, such as shaping your monetization strategy for free-to-paid conversion.

To ensure effective prioritization, encourage stakeholders to set clear expectations and be realistic about the urgency of their requests. Begin this expectation-setting exercise at the start of each quarter, anticipating upcoming analytics needs, but make sure flexibility is baked into your roadmap. Additionally, make the analysis backlog visible, ideally to the entire organization, but at least to your team and your manager. This transparency enables others to see the tasks in progress and work together to agree on data analysis priorities that align with business goals.
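One lightweight way to structure those impact vs. effort discussions is a simple impact-over-effort score. The sketch below uses hypothetical requests with 1–5 ratings agreed on with stakeholders; the scoring scheme is illustrative, not a prescribed formula:

```python
# Hypothetical backlog with stakeholder-agreed 1-5 ratings.
requests = [
    {"name": "Free-to-paid conversion analysis", "impact": 5, "effort": 3},
    {"name": "Curiosity: weekend traffic dip", "impact": 2, "effort": 2},
    {"name": "Churn survey deep dive", "impact": 4, "effort": 5},
]

# Higher impact per unit of effort floats to the top of the backlog.
for r in sorted(requests, key=lambda r: r["impact"] / r["effort"], reverse=True):
    print(f'{r["name"]}: {r["impact"] / r["effort"]:.2f}')
```

Even a rough score like this turns “everything is urgent” into an ordered, discussable list.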

Break analysis down into smaller tasks

Another way for strategists to make their workload more manageable is by narrowing the data scope. Data is messy and can be overwhelming to dig into sometimes, especially when a stakeholder makes a vague request.

One request could be as vague as “What did Q1 churn look like?” A million questions come to mind about how to slice the churn view:

  • Count vs. percentage of churn by logos
  • Count vs. percentage of churn by annual recurring revenue
  • Verbatim survey responses on reasons for churn
  • Lack of usage leading up to churn
  • Underutilization vs. overutilization
  • The lifespan of customers on the plan prior to churn
  • Churn buckets (Remember: Not all churn is bad churn, especially when an organization upgrades to a new plan.)

We might also want to compare Q1 to previous quarters and model upcoming churn predictions based on Q1 and prior trends.

Paired with each of these options is a different dataset you can use for analysis. We might have more historical churn survey responses stored in one system, but more recent, standardized survey responses stored in a newer system. How critical the pre-Q1 historical insights are, relative to the most recent ones, will guide our efforts to merge and transform the two datasets.
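To show how much even the first slice above differs from the others, here’s a minimal sketch of logo churn as a count vs. a percentage, using hypothetical Q1 numbers:

```python
# Hypothetical Q1 figures for the first slice: churn by logos.
logos_at_q1_start = 400   # accounts active on January 1
logos_churned_in_q1 = 18  # accounts lost during Q1

churn_count = logos_churned_in_q1
churn_rate = logos_churned_in_q1 / logos_at_q1_start

print(churn_count)           # 18 logos
print(f"{churn_rate:.1%}")   # 4.5% logo churn
```

Each of the other slices (revenue churn, survey verbatims, usage decay) needs its own definitions and data sources, which is why scoping the request up front matters.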

It’s important to ask questions to clarify the full scope of the request and ensure alignment on the expected final analysis. You can spend hours investigating deep insights, but sometimes, a stakeholder just needs a quick, high-level response.

My process tends to look something like this:

  • Identify the priority of the request.
  • Identify the underlying data points and data sources required for the analysis.
  • Identify the data quality and data trust for those sources.
  • Document everything.
  • Share iterative results from the analysis with any known data caveats and open a discussion.

The key here is to communicate openly with stakeholders. The scope of the initial request may evolve, and that’s fine. Opening up the discussion leads to a healthier conversation about the analysis and the available datasets, and it may also reveal a clearer direction for the analysis.

Share a reasonable deliverable roadmap—for example, “I can tackle X part of the analysis in the next two weeks and should be able to provide the final analysis by the end of the third week.” Regularly check in with the stakeholders about the analysis request. That way, you receive feedback about whether you’re on track to reach the goal and can factor in any recent changes to your analysis.

3. Communicating with stakeholders

Sometimes, data only tells you one side of the story. And even when you’ve checked multiple data points and have a complete picture, you may still have to tell a stakeholder something they don’t want to hear: The analysis disproves their hypothesis.

Widen your data lens

In both cases, the solution is to tell the most comprehensive story possible. Consider where you can tap into various data sources based on factors such as:

  • Historical data. How far back does your analysis need to look?
  • Accuracy. What level of accuracy is required for the analysis to be reliable?
  • Structure. What data structures are most appropriate for the type of analysis?
  • Storage access permissions. Do you have the right permissions to access and use the data needed for your analysis?

In addition to behavioral data in your product, you’ll want to look at survey feedback, interviews, and session replays.

Interrogate your findings. Let’s say you’re analyzing the impact of promo codes for a software subscription service. Ask questions about the data to help you get a more holistic view of customer behavior, for example:

  • Which actions are users taking on their way to checkout?
  • What is the first-touch vs. last-touch attribution to the checkout? For example, would the user still have purchased the subscription without the promo code, or did an email campaign offering the promo resurrect a dormant user and drive them back to the website to complete their checkout? (The sketch after this list contrasts the two attribution models.)
  • Does the promo code incentivize larger transactions vs. smaller transactions?
  • Does the promo code incentivize longer annual subscriptions vs. shorter monthly subscriptions?
  • Does the promo code reduce the product value to the user?
  • Are we setting price expectations we won’t be able to sustain?
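For the second question in this list, here’s a minimal sketch contrasting first-touch and last-touch attribution across hypothetical user journeys:

```python
from collections import Counter

# Hypothetical ordered marketing touches preceding each user's checkout.
journeys = {
    "u1": ["organic_search", "email_promo", "direct"],
    "u2": ["email_promo", "direct"],
    "u3": ["paid_ad", "organic_search"],
}

first_touch = Counter(t[0] for t in journeys.values())   # credit the opener
last_touch = Counter(t[-1] for t in journeys.values())   # credit the closer

print(first_touch)  # Counter({'organic_search': 1, 'email_promo': 1, 'paid_ad': 1})
print(last_touch)   # Counter({'direct': 2, 'organic_search': 1})
```

The two models tell different stories about the same journeys, which is exactly why the promo-code question above is worth interrogating.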

Tell a relatable and actionable story

Data storytelling means synthesizing the data, then extracting key findings and actionable next steps from it. This begins with clear documentation. Document the problem that drove the analysis and all relevant hypotheses. Note the data sources used for the analyses and call out any caveats.

For example, specify if there isn’t much historical data or if a seasonal impact caused an anomaly, resulting in a dip or spike in numbers. And always consider a lens that looks beyond the numbers. Empathize with the end users to understand their motivations, incentives, and jobs to be done using your product.

Frame your story around how humans behave and make it relatable. Draw links between your analysis and experiences people are familiar with in other products so stakeholders can imagine themselves in the user’s place.

Amplitude Notebooks amplify storytelling with relevant and timely insights from embedded, dynamic charts and cohorts. Simplify insights sharing by bringing teammates directly to the data source, where they can expand on specific analyses and monitor trends more deeply. Combine charts with key takeaways, hypotheses, and important context, then mention your colleagues in comments to crowdsource their feedback and increase visibility.

Start your path to better data

By addressing challenges such as data quality, workload prioritization, and effective communication, you can unlock the full potential of your company’s data. You’ll be able to focus on asking the right questions and know that you can trust the answers. You’ll help your organization get more high-quality insights, faster. The right tool will help you get there.

Dreaming of a world where data is cleaner, more accessible, and easier to manage? Get started with Amplitude for free today, or check out The Amplitude Guide to Behavioral Data & Event Tracking.

About the Author
Jennifer Rudin
Senior Data Strategist, Amplitude
Jennifer is Amplitude’s Senior Data Strategist, focused on integrating Amplitude into her everyday work. She aims to break down data silos, increase data literacy, and build confidence in Amplitude’s internal data infrastructure.
