Snowflake
Load your Amplitude event data into your Snowflake account. You can set up recurring syncs through the Amplitude UI, as well as manually start a sync of your historical data.
Considerations
Depending on your company's network policy, you may need to add these IP addresses to your allowlist so Amplitude's servers can access your Snowflake instance:
- Amplitude US IP addresses:
- 52.33.3.219
- 35.162.216.242
- 52.27.10.221
- Amplitude EU IP addresses:
- 3.124.22.25
- 18.157.59.125
- 18.192.47.195
- Amplitude US IP addresses:
Limits
- Maximum running time for a single Snowflake SQL query is 12 hours.
- Each project can have only one Snowflake Export destination for each data type. This means one destination for events and one destination for merged IDs per project.
Reuse existing Snowflake credentials
When you create a new Snowflake export or import, Amplitude lets you select previously saved credentials from your organization instead of entering them again. Reusable credentials include connections from other exports, imports, and connections made in other projects (if you have permission).
Shared credentials update together
Credentials are shared at the underlying level. If you update the password or key pair on one connection, Amplitude updates all connections that share those credentials. Review which connections use the credentials before making changes.
To reuse existing credentials, select Use existing credentials at the credential entry step when creating a new connection. Amplitude lists all saved credentials from your organization. To reuse credentials from another project, you need permission to create Data Warehouse connections in both projects.
Set up the integration
Creating a recurring data export is a three-step process you can handle yourself. Each sync typically completes within five to ten minutes, though timing may vary depending on system load and data volume. This method also lets you watch jobs.
Use a dedicated warehouse for Amplitude data
Amplitude strongly recommends using a dedicated Snowflake warehouse instance for loading Amplitude data. Using a shared warehouse can cause load capacity issues and disrupt other Snowflake integrations. Configure a separate warehouse to ensure optimal performance and prevent conflicts with your other Snowflake operations.
To set up a recurring export of your Amplitude data to Snowflake, follow these steps:
Required user permissions
You need admin/manager privileges in Amplitude, and a role that allows you to enable resources in Snowflake.
In Amplitude Data, click Catalog and select the Destinations tab.
In the Warehouse Destinations section, click Snowflake.
Under Export Data to Snowflake, choose which data to include in this export: Export events ingested today and moving forward, Export all merged Amplitude IDs, or export both. For events, you can also specify filtering conditions to export only events that meet certain criteria.
Review the Event table and Merge IDs table schemas and click Next.
In the Snowflake Credentials For Amplitude section, enter the following information:
- Account Identifier: The account identifier on your Snowflake account, in the format
<org-name>-<account-name>. For more information, refer to Snowflake's account identifier documentation{:target="_blank" rel="noopener noreferrer"}. - Warehouse: The warehouse Amplitude uses to load the data. Use a dedicated warehouse for Amplitude data to prevent load capacity issues with other Snowflake integrations. Sharing a warehouse can disrupt your other Snowflake operations.
- Database: The database where Amplitude stores data. Dedicate this database specifically to Amplitude data.
- Role: The role that the connection uses to write this data. The default value of role is
AMPLITUDEonly. - Username: The username Amplitude uses to connect to the Snowflake account.
Amplitude offers password-based and key pair authentication for Snowflake.
- Account Identifier: The account identifier on your Snowflake account, in the format
Snowflake Password Authentication Deprecation
Starting in May 2026, Snowflake is removing support for single-factor password authentication. This affects data exports from Amplitude to Snowflake. Amplitude recommends migrating to key pair authentication for enhanced security and future compatibility with Snowflake. For migration guidance, review the Snowflake Password Authentication Deprecation FAQ.
If you use password-based authentication, note that your password is case-sensitive.
To use password authentication, select the Password option and enter your password in the Password field. Key pair authentication (Recommended): To use key pair authentication, select the Key pair option and click Generate Key. To use key pair authentication, provide the organization and account names in the format ORGNAME-ACCOUNTNAME.
When using key pair authentication, you must use the format ORGNAME-ACCOUNTNAME. Without this specific format, Snowflake generates a JWT token is invalid error.
- Copy the autogenerated SQL query and run it in Snowflake to give Amplitude the proper permissions.
- Click Next. Amplitude attempts to upload test data using the credentials you entered. If the upload is successful, click Finish.
Amplitude automatically sends all future events to Snowflake.
From here, Amplitude generates micro-batch files and loads them to customer-owned Snowflake accounts on a best-effort basis. Exports typically run every 10 minutes, but may run less frequently depending on system load and data volume. You can typically see the data in your Snowflake accounts within 20 minutes after Amplitude receives the events, though timing may vary.
Export historical Amplitude data to Snowflake
To export your historical data from Amplitude into Snowflake, navigate to the integration, then click Export Data and select a date range.
This process can take from a single day to several weeks. It depends on your data volume, warehouse size, cluster count, network bandwidth, and the number of concurrent historical data exports you currently have, among other factors.
Reduce your Snowflake computation costs
To reduce Snowflake computation costs when receiving data from Amplitude, try these methods:
The effectiveness of these recommendations depends on the frequency with which you export to your Snowflake instance. Remember to use a dedicated warehouse for Amplitude data to prevent performance issues with other Snowflake operations.
- Modify warehouse size and number of clusters for your dedicated Amplitude warehouse. For shorter export cadences (15 to 30 minutes), try starting with
xsmallorsmall. Then upgrade as needed. - When backfilling data into Snowflake, start with a small warehouse and upgrade as needed.
- You can also try reducing the auto suspend time to 60s. This option might not be available within the Snowflake UI, but you can manually set it through a direct Snowflake query.
Snowflake export format
Data locations
Amplitude exports data to your Snowflake account using the following naming convention:
| Component | Format | Description |
|---|---|---|
| Database | DB_{org_id} | Uses your Amplitude organization ID for the database name. |
| Schema | SCHEMA_{project_id} | Uses your Amplitude project ID for the schema name. |
| Events Table | EVENTS_{project_id} | Uses your Amplitude project ID for the events table name. |
| Merged Users Table | MERGE_IDS_{project_id} | Uses your Amplitude project ID for the merged users table name. |
For example, if your organization ID is 12345 and your project ID is 67890:
- Events:
DB_12345.SCHEMA_67890.EVENTS_67890. - Merged Users:
DB_12345.SCHEMA_67890.MERGE_IDS_67890.
Each project can have only one Snowflake Export destination for each data type. This means one destination for events and one destination for merged IDs per project.
Event table
Event table schema
The Event table schema includes the following columns:
| Column | Type | Description |
|---|---|---|
amplitude_attribution_ids | ARRAY | Hashed attribution ids on the event |
amplitude_id | NUMBER | The original Amplitude ID for the user. Use this field to automatically handle merged users. Example: 2234540891 |
app | NUMBER | Project ID found in your project's Settings page. Example: 123456 |
city | VARCHAR | City. Example: “San Francisco” |
client_event_time | TIMESTAMP_NTZ | Local timestamp (UTC) of when the device logged the event. Example: 2015-08-10T12:00:00.000000 |
client_upload_time | TIMESTAMP_NTZ | The local timestamp (UTC) of when the device uploaded the event. Example: 2015-08-10T12:00:00.000000 |
country | VARCHAR | Country. Example: "United States" |
data | VARIANT | Dictionary where certain fields such as first_event and merged_amplitude_id are stored |
device_carrier | VARCHAR | Device Carrier. Example: Verizon |
device_family | VARCHAR | Device family. Example: Apple iPhone |
device_id | VARCHAR | The device specific identifier. Example: C8F9E604-F01A-4BD9-95C6-8E5357DF265D |
device_type | VARCHAR | Device type. Example: Apple iPhone 5s |
dma | VARCHAR | Designated marketing area (DMA). Example; San Francisco-Oakland-San Jose, CA |
event_id | NUMBER | A counter that distinguishes events. Example: 1 |
event_properties | VARIANT | |
event_time | TIMESTAMP_NTZ | Amplitude timestamp (UTC) which is the client_event_time adjusted by the difference between server_received_time and client_upload_time, specifically: event_time = client_event_time + (server_received_time - client_upload_time) Amplitude uses this timestamp is used to organize events on Amplitude charts. NOTE: If the difference between server_received_time and client_upload_time is less than 60 seconds, the event_time isn't adjusted and equals the client_event_time. Example: 2015-08-10T12:00:00.000000 |
event_type | VARCHAR | Event type |
group_properties | VARIANT | |
groups | VARIANT | Group types. See the Accounts documentation for more information. |
ip_address | VARCHAR | IP address. Example: "123.11.111.11" |
language | VARCHAR | |
library | VARCHAR | |
location_lat | FLOAT | Latitude. Example: 12.3456789 |
location_lng | FLOAT | Longitude. Example: -123.4567890 |
os_name | VARCHAR | OS name. Example: ios |
os_version | VARCHAR | OS version. |
paying | VARCHAR | True if the user has ever logged any revenue, otherwise (none). Note: The property value can be modified via the Identify API. Example: true |
platform | VARCHAR | |
processed_time | TIMESTAMP_NTZ | |
region | VARCHAR | Region. Example: California |
sample_rate | NUMBER | |
server_received_time | TIMESTAMP_NTZ | |
server_upload_time | TIMESTAMP_NTZ | Amplitude timestamp (UTC) of when Amplitude servers received the event. Example: 2015-08-10T12:00:00.000000 |
session_id | NUMBER | The session start time in milliseconds since epoch. Example: 1396381378123 |
start_version | VARCHAR | App version the user was first tracked on. Example: 1.0.0 |
user_id | VARCHAR | A readable ID specified by you. Should be something that doesn't change; for that reason, using the user's email address isn't recommended. |
user_properties | VARIANT | |
uuid | VARCHAR | A unique identifier per row (event sent). Example: bf0b9b2a-304d-11e6-934f-22000b56058f |
version_name | VARCHAR | The app version. Example: 1.0.0 |
Event table clustering
The exported events table uses these clustering keys by default (in order):
TO_DATE(EVENT_TIME)TO_DATE(SERVER_UPLOAD_TIME)EVENT_TYPEAMPLITUDE_ID
This optimizes query performance for time-based queries. You can modify the clustering keys to match your query patterns.
Merged User table
Merged User table schema
The Merged User table schema contains the following:
| Column | Type | Description |
|---|---|---|
amplitude_id | NUMBER(38,0) | The Amplitude ID being merged into a user's original Amplitude ID. |
merge_event_time | TIMESTAMP | The time of the event a user's new Amplitude ID was associated with their original Amplitude ID. |
merge_server_time | TIMESTAMP | The server time of the event when a user's new Amplitude ID was associated with their original Amplitude ID. |
merged_amplitude_id | NUMBER(38,0) | The originally assigned Amplitude ID when the user is first created. |
Merged User table clustering
Amplitude clusters the merged IDs table by DATE_TRUNC('HOUR', MERGE_SERVER_TIME). This optimizes queries that filter by when user merges occurred. You can modify the clustering keys to match your query patterns.
Was this helpful?