Profiles

Profiles let you join customer profile data from your data warehouse with the behavioral product data already in Amplitude.

Note: This feature is currently in an open beta.

Profiles act as standalone properties: they're associated with a user profile rather than with specific events. This makes them different from traditional user properties and lets you conduct more expansive analyses.

Profiles always display the most current data synced from your warehouse.

Before you begin

Regardless of whether you're using Snowflake or Databricks, Change Data Capture (CDC) doesn't support replacing existing tables. Instead, you must use incremental modeling. If the table you integrate with drops and replaces data, the connection breaks.
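As a rough illustration of incremental modeling, the sketch below upserts rows into the profiles table with MERGE instead of rebuilding it. The staging table PROFILES_STAGING and the columns title and number_of_purchases are hypothetical; substitute your own names. This is one possible pattern, not necessarily the one your pipeline needs.

```sql
-- Avoid rebuilding the table (for example, CREATE OR REPLACE TABLE ... AS SELECT ...),
-- because dropping and replacing data breaks the connection.
-- Instead, apply changes in place. PROFILES_STAGING and the column names
-- below are hypothetical placeholders.
MERGE INTO DATAPL_DB_STAG.PUBLIC.PROFILES_PROPERTIES_TABLE_1 AS target
USING DATAPL_DB_STAG.PUBLIC.PROFILES_STAGING AS source
  ON target.user_id = source.user_id
WHEN MATCHED THEN UPDATE SET
  title = source.title,
  number_of_purchases = source.number_of_purchases
WHEN NOT MATCHED THEN INSERT (user_id, title, number_of_purchases)
  VALUES (source.user_id, source.title, source.number_of_purchases);
```

Because the table itself is never dropped, its change tracking history stays intact between runs.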

Snowflake users

If this is your first time importing data from this table, set a data retention time and enable change tracking in Snowflake with the following commands:

ALTER TABLE DATAPL_DB_STAG.PUBLIC.PROFILES_PROPERTIES_TABLE_1 SET DATA_RETENTION_TIME_IN_DAYS = 7;

ALTER TABLE DATAPL_DB_STAG.PUBLIC.PROFILES_PROPERTIES_TABLE_1 SET CHANGE_TRACKING = TRUE;

On Snowflake Standard Edition plans, the maximum retention time is one day. If you’re on this plan, you should set the frequency to 12 hours in later steps.
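To confirm that both settings took effect, you can inspect the table's metadata. The following is a minimal check that reuses the table name from the commands above:

```sql
-- In the result, check the retention_time column (days of retention)
-- and the change_tracking column (ON or OFF).
SHOW TABLES LIKE 'PROFILES_PROPERTIES_TABLE_1' IN SCHEMA DATAPL_DB_STAG.PUBLIC;
```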

Databricks users

Follow these instructions to enable change tracking:

  • If you're working with a new table, set the table property delta.enableChangeDataFeed = true in the CREATE TABLE command:

    CREATE TABLE student (id INT, name STRING, age INT) TBLPROPERTIES (delta.enableChangeDataFeed = true)

    To enable the change data feed for all new tables, also set spark.databricks.delta.properties.defaults.enableChangeDataFeed = true.

  • If you're working with an existing table, set the table property delta.enableChangeDataFeed = true in the ALTER TABLE command:

    ALTER TABLE myDeltaTable SET TBLPROPERTIES (delta.enableChangeDataFeed = true)

Set a data retention period. This must be at least one day, but in most cases you should set this period to seven days or longer. If your retention period is too short, the import process can fail.
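The sketch below shows one way to check the change data feed property and set a seven-day retention period on the example table. It assumes that retention is controlled through the standard Delta table property delta.deletedFileRetentionDuration; this page doesn't name the exact property, so confirm the right setting for your workspace before relying on it.

```sql
-- Confirm that delta.enableChangeDataFeed is listed as true.
SHOW TBLPROPERTIES myDeltaTable;

-- Assumption: retention is governed by this Delta table property.
-- Seven days matches the recommendation above.
ALTER TABLE myDeltaTable SET TBLPROPERTIES (
  'delta.deletedFileRetentionDuration' = 'interval 7 days'
);
```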

Set up a profile (Snowflake users)

To set up a profile in Amplitude, follow these steps:

  1. In Amplitude Data, navigate to Connections Overview. Then in the Sources panel, click Add More. Scroll down until you find the Snowflake tile and click it.

  2. On the Set Up Connection tab, connect Amplitude to your data warehouse by filling in all the relevant fields under Snowflake Credentials, which are outlined in the Snowflake Data Import guide. You can either create a new connection, or reuse an existing one. Click Next when you're done.

  3. You can see a list of your tables under Select Table. To begin column mapping, click the table you're interested in.

  4. In the list of required fields under Column Mapping, enter the column names in the appropriate fields to match columns to required fields. To add more fields, click + Add field.

  5. On the Select Data tab, select the profiles data type. Amplitude pre-selects the required change data capture import strategy for you, which you can see under the Select Import Strategy dropdown:

    • Insert: Always on. Creates new profiles when rows are added to your table.
    • Update: Syncs changes to values from your table to Amplitude.
    • Delete: Syncs deletions from your table to Amplitude.

  6. When you're done, click Test Mapping to verify your mapping information. Then click Next.

  7. Name the source and set the frequency at which Amplitude should refresh your profiles from the data warehouse. You should set the frequency to 12 hours if you are on Snowflake Standard Edition.

Set up a profile (Databricks users)

To set up a profile in Amplitude, follow these steps:

  1. In Amplitude Data, navigate to Connections Overview. Then in the Sources panel, click Add More. Scroll down until you find the Databricks tile and click it.

  2. In the Set Up Connection tab, connect Amplitude to your data warehouse. Have the following information ready:

    • Server hostname: This is the hostname of your Databricks cluster. You can find it in your cluster configuration by navigating to Advanced Options -> JDBC/ODBC -> Server Hostname.
    • HTTP path: This is the HTTP path of the cluster you would like to connect to. You can find it in your cluster configuration by navigating to Advanced Options -> JDBC/ODBC -> HTTP Path.
    • Personal access token: Use a personal access token to authenticate with your Databricks cluster. Learn how to create one here.

    Click Next when you're done.

  3. You can see a list of your tables under Select Table. To begin column mapping, click the table you're interested in.

  4. In the list of required fields under Column Mapping, enter the column names in the appropriate fields to match columns to required fields. To add more fields, click + Add field.

  5. In the Data Selection tab, select the profiles data type.

  6. When you're done, click Test Mapping to verify your mapping information. Then click Next.

  7. Name the source and set the frequency at which Amplitude should refresh your profiles from the data warehouse. The default frequency is 12 hours, but you can change it.

Data specifications

Profiles supports:

  • Up to 100 million users
  • Up to 200 profile properties
  • Up to 200 warehouse properties
  • Known Amplitude users

Each profile must include a user_id.

Field | Description | Example
user_id | Identifier for the user. Must have a minimum length of 5. |
Profile Property 1 | Profile property set at the user level. The value of this field is the value from the customer's source as of the last sync. |
Profile Property 2 | Profile property set at the user level. The value of this field is the value from the customer's source as of the last sync. |

Example:

{
  "user_id": 12345,
  "number of purchases": 10,
  "title": "Data Engineer"
}
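Before connecting a table, it can help to check it against the user_id requirements. The query below is a minimal sketch that counts rows with a missing user_id or one shorter than five characters. It reuses the example Snowflake table name from earlier on this page; replace it with your own table.

```sql
-- Counts rows that would violate the user_id requirements:
-- a missing value, or a value shorter than 5 characters.
-- The CAST handles user_id columns stored as numbers.
SELECT COUNT(*) AS invalid_rows
FROM DATAPL_DB_STAG.PUBLIC.PROFILES_PROPERTIES_TABLE_1
WHERE user_id IS NULL
   OR LENGTH(CAST(user_id AS STRING)) < 5;
```

A result of 0 means every row carries a user_id of the required length.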

See this article for information on Snowflake profiles.

SQL template

SELECT
  AS "user_id",
  AS "profile_property_1",
  AS "profile_property_2"
FROM DATABASE_NAME.SCHEMA_NAME.TABLE_OR_VIEW_NAME

Clear a profile value

When you remove profile values in your data warehouse, those values sync to Amplitude during the next sync operation. You can also use Amplitude Data to remove unused property fields from users in Amplitude.
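As an illustration, the statement below removes a single profile value in the warehouse. It assumes that setting the column to NULL counts as removing the value, and it uses a hypothetical title column and user ID; adjust both to match your table.

```sql
-- Assumption: setting a column to NULL is treated as removing the value.
-- The title column and the user_id value are hypothetical.
UPDATE DATAPL_DB_STAG.PUBLIC.PROFILES_PROPERTIES_TABLE_1
SET title = NULL
WHERE user_id = '12345';
```

The change reaches Amplitude during the next scheduled sync, not immediately.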

Sample queries

SELECT
  user_id as "user_id",
  upgrade_propensity_score as "Upgrade Propensity Score",
  user_model_version as "User Model Version"
FROM
  ml_models.prod_propensity_scoring

SELECT
  m.uid as "user_id",
  m.title as "Title",
  m.seniority as "Seniority",
  m.dma as "DMA"
FROM
  prod_users.demo_data m

October 1st, 2024
