Warehouse-native Amplitude: Best Practices

Warehouse-native Amplitude enables you to bring your own models to your analyses. To get the most out of your data as quickly as possible, consider these best practices:

Clustering key

  • Choose appropriate columns for clustering keys based on the query patterns and filtering conditions in your analytics workload.
    • For example, for event (fact) tables, cluster on event time using the LINEAR() function.
  • Avoid using very high-cardinality columns (such as unique identifiers) as clustering keys, because they can make data storage and query performance less efficient.
  • Use composite clustering keys when multiple columns are often used together in join operations or filters.
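
As a rough illustration of the clustering guidance above, the following Python sketch builds a clustering statement for an event table. It assumes a Snowflake-style `ALTER TABLE ... CLUSTER BY (...)` syntax. The table name `analytics.events`, the columns `event_time` and `event_type`, and the helper `build_cluster_by_statement` are hypothetical names chosen for this example. Wrapping the timestamp in `TO_DATE` is one way to cluster on event time while keeping the key's cardinality low.

```python
def build_cluster_by_statement(table: str, expressions: list[str]) -> str:
    """Return an ALTER TABLE ... CLUSTER BY statement (Snowflake-style syntax assumed)."""
    if not expressions:
        raise ValueError("at least one clustering expression is required")
    return f"ALTER TABLE {table} CLUSTER BY ({', '.join(expressions)})"


# Hypothetical table and columns: cluster on the event date first,
# then on a lower-cardinality column that queries often filter on.
statement = build_cluster_by_statement(
    "analytics.events",
    ["TO_DATE(event_time)", "event_type"],
)
print(statement)
```

Review the generated statement against your warehouse's own syntax before you run it, because clustering options differ between warehouses.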

Schema format

  • Use a star schema or snowflake schema to optimize query performance and simplify data analysis.
    • Star schema: features a central fact table linked to dimension tables, suitable for simpler queries and faster aggregations.
    • Snowflake schema: a normalized version of the star schema that minimizes data redundancy and improves data integrity at the cost of more complex queries.
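
To make the star schema layout concrete, here is a minimal sketch that uses Python's built-in `sqlite3` module as a local stand-in for a warehouse. The table and column names (`fact_event`, `dim_user`, `dim_event_type`) and the sample rows are made up for this example. It shows one central fact table joined to two dimension tables for a simple aggregation.

```python
import sqlite3

# In-memory SQLite database used only as a stand-in for a real warehouse.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_user (user_id INTEGER PRIMARY KEY, country TEXT);
CREATE TABLE dim_event_type (event_type_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE fact_event (
    event_id INTEGER PRIMARY KEY,
    user_id INTEGER REFERENCES dim_user(user_id),
    event_type_id INTEGER REFERENCES dim_event_type(event_type_id),
    event_time TEXT
);
INSERT INTO dim_user VALUES (1, 'US'), (2, 'DE');
INSERT INTO dim_event_type VALUES (1, 'page_view'), (2, 'purchase');
INSERT INTO fact_event VALUES
    (1, 1, 1, '2024-06-01T10:00:00'),
    (2, 1, 2, '2024-06-01T10:05:00'),
    (3, 2, 1, '2024-06-02T09:00:00');
""")

# Aggregate the fact table, using the dimensions only for descriptive attributes.
rows = conn.execute("""
SELECT u.country, t.name, COUNT(*) AS events
FROM fact_event AS f
JOIN dim_user AS u ON u.user_id = f.user_id
JOIN dim_event_type AS t ON t.event_type_id = f.event_type_id
GROUP BY u.country, t.name
ORDER BY u.country, t.name
""").fetchall()
print(rows)
```

A snowflake schema would normalize further, for example by moving `country` out of `dim_user` into its own table, which adds one more join to the same query.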

Partition and clustering

Partition large tables to reduce the amount of data scanned and improve query performance.
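
The effect of partitioning on data scanned can be sketched with a toy model in Python. This is not how any particular warehouse implements partitions. The helpers `partition_by_day` and `scan` and the sample events are hypothetical, and the sketch only illustrates that a date filter lets a query skip partitions outside the requested range.

```python
from collections import defaultdict
from datetime import date


def partition_by_day(events):
    """Group events into one partition per event date."""
    partitions = defaultdict(list)
    for event in events:
        partitions[event["event_date"]].append(event)
    return partitions


def scan(partitions, start, end):
    """Return (matching_events, rows_scanned), reading only partitions in range."""
    matched, rows_scanned = [], 0
    for day, rows in partitions.items():
        if start <= day <= end:  # partitions outside the range are skipped
            rows_scanned += len(rows)
            matched.extend(rows)
    return matched, rows_scanned


# Made-up sample data.
events = [
    {"event_date": date(2024, 6, 1), "event_type": "page_view"},
    {"event_date": date(2024, 6, 1), "event_type": "purchase"},
    {"event_date": date(2024, 6, 2), "event_type": "page_view"},
    {"event_date": date(2024, 6, 3), "event_type": "page_view"},
]
partitions = partition_by_day(events)
matched, rows_scanned = scan(partitions, date(2024, 6, 2), date(2024, 6, 3))
print(len(matched), rows_scanned, len(events))
```

Here the query reads only the rows in the two matching daily partitions rather than the whole table.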


June 4th, 2024
