Event Mutability: Sync Data Warehouse Changes to Amplitude
Learn to sync data warehouse changes to Amplitude.
Amplitude's Data Mutability features keep data consistent between your warehouse and Amplitude by supporting INSERT, UPDATE, and DELETE operations on your event data. This capability is available through the Mirror Sync strategy across multiple warehouse integrations, so your Amplitude data stays synchronized with your source of truth.
Data Mutability lets you:
- Insert new events into Amplitude.
- Update existing events with new information.
- Delete events that should no longer exist in your analytics.
This functionality is especially valuable for organizations that need to:
- Correct historical data errors.
- Adhere to data privacy regulations (GDPR, CCPA).
- Maintain data consistency across systems.
- Handle late-arriving or corrected data.
Supported data sources
Data Mutability is available through the following warehouse integrations.
Snowflake
- Mirror Sync strategy with Change Data Capture (CDC).
- Supports `INSERT`, `UPDATE`, and `DELETE` operations.
- Requires Change Tracking enabled on source tables.
- Learn more about Snowflake integration →.
Databricks
- Mirror Sync strategy with Change Data Feed (CDF).
- Supports `INSERT`, `UPDATE`, and `DELETE` operations.
- Requires Change Data Feed enabled on Delta tables.
- Learn more about Databricks integration →.
Amazon S3
- Mirror Sync strategy for file-based mutations.
- Supports `INSERT`, `UPDATE`, and `DELETE` operations.
- Requires structured mutation metadata in your data files.
- Learn more about Amazon S3 integration →.
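To make the S3 requirement concrete, here is a minimal sketch of what a newline-delimited mutation file could look like. The schema below, including the `operation` field name, is an illustrative assumption, not Amplitude's documented S3 mutation format; consult the S3 integration docs for the actual field names.

```python
import json

# Hypothetical mutation records; the "operation" field name is an assumption
# for illustration, not Amplitude's documented schema.
records = [
    {"operation": "INSERT", "user_id": "u1", "insert_id": "e1",
     "event_time": "2024-01-01T00:00:00Z", "event_type": "signup"},
    {"operation": "UPDATE", "user_id": "u1", "insert_id": "e1",
     "event_time": "2024-01-01T00:00:00Z", "event_type": "signup_verified"},
    {"operation": "DELETE", "user_id": "u2", "insert_id": "e9",
     "event_time": "2024-01-02T00:00:00Z"},
]

# Serialize as newline-delimited JSON, one mutation per line.
ndjson = "\n".join(json.dumps(r) for r in records)

def count_operations(payload: str) -> dict:
    """Tally mutation operations in an NDJSON payload."""
    counts: dict = {}
    for line in payload.splitlines():
        op = json.loads(line)["operation"]
        counts[op] = counts.get(op, 0) + 1
    return counts

print(count_operations(ndjson))  # {'INSERT': 1, 'UPDATE': 1, 'DELETE': 1}
```

The key point is that every line carries both the event identity and the operation to apply, so the sync can replay the file deterministically.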
How Mirror Sync works
When you enable Mirror Sync with data mutability:
Change detection: The integration monitors your warehouse for data changes using native change tracking features (CDC for Snowflake, CDF for Databricks, or file metadata for S3).
Operation processing: Amplitude processes three types of operations:
- `INSERT`: Adds new events to Amplitude.
- `UPDATE`: Modifies existing events in Amplitude.
- `DELETE`: Removes events from Amplitude.
Amplitude finds matching events based on the combination of `user_id`, `insert_id`, and `event_time`. All three fields must match before Amplitude can identify and modify the correct event.
Data synchronization: Amplitude applies the changes to keep your Amplitude data consistent with your warehouse.
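The three-field matching rule above can be sketched as follows. This is a simplified model of the behavior, not Amplitude's implementation; the in-memory `store` and the function names are illustrative.

```python
# Sketch: an event is identified by the combination of user_id, insert_id,
# and event_time; all three must match for UPDATE and DELETE to take effect.

def event_key(event: dict) -> tuple:
    return (event["user_id"], event["insert_id"], event["event_time"])

def apply_mutation(store: dict, op: str, event: dict) -> None:
    key = event_key(event)
    if op == "INSERT":
        store[key] = dict(event)
    elif op == "UPDATE":
        if key in store:             # only an existing, matching event updates
            store[key].update(event)
    elif op == "DELETE":
        store.pop(key, None)         # deleting a missing event is a no-op

store: dict = {}
e = {"user_id": "u1", "insert_id": "abc", "event_time": 1704067200000,
     "event_type": "purchase", "revenue": 10}
apply_mutation(store, "INSERT", e)
apply_mutation(store, "UPDATE", {**e, "revenue": 12})
print(len(store), store[event_key(e)]["revenue"])  # 1 12
apply_mutation(store, "DELETE", e)
print(len(store))  # 0
```

Note that an UPDATE with a different `event_time` would not match the stored event, which is why the doc calls `insert_id` immutable.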
Enrichment services
Enrichment Services Disabled
When using Mirror Sync with data mutability, Amplitude disables enrichment services, including:
- ID resolution and user merging.
- Property and attribution syncing.
- Location resolution.
- Taxonomy validation.
Disabling enrichment ensures your data remains exactly as it exists in your source of truth.
General requirements
- User ID required: All events must contain a user ID. Mirror Sync doesn't support anonymous events.
- Unique Insert ID: Each event should have a unique and immutable `insert_id` to prevent duplication.
- Chronological order: Process events in chronological order when possible.
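One common way to get a unique, immutable `insert_id` is to derive it deterministically from attributes that already identify the event in your warehouse, so that retries of the same row always produce the same id. A minimal sketch using Python's standard `uuid.uuid5`:

```python
import uuid

# Assumption: (user_id, event_type, event_time_ms) uniquely identifies an
# event in your warehouse. The namespace UUID is arbitrary; pick one and
# never change it, so re-running the pipeline yields identical insert_ids.
NAMESPACE = uuid.NAMESPACE_DNS

def make_insert_id(user_id: str, event_type: str, event_time_ms: int) -> str:
    return str(uuid.uuid5(NAMESPACE, f"{user_id}|{event_type}|{event_time_ms}"))

a = make_insert_id("u1", "purchase", 1704067200000)
b = make_insert_id("u1", "purchase", 1704067200000)  # retry of the same event
c = make_insert_id("u1", "purchase", 1704067200001)  # different event

print(a == b, a == c)  # True False
```

A deterministic id like this also makes mutation operations naturally idempotent, which matters for the retry guidance in the best practices below.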
Event volume considerations
Event Volume Impact
Data mutations count toward your event volume:
- Warehouse sources (Snowflake, Databricks): Multiple operations on the same event within a sync window count as one event.
- File sources (S3): Each operation counts separately toward your event volume.
Monitor your usage and contact sales if you need additional event volume.
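The difference between the two billing models above can be sketched as follows (illustrative only; your Amplitude plan is the authoritative source for event accounting):

```python
# Three mutation operations, two of which touch the same event.
mutations = [
    ("INSERT", "u1", "e1"),
    ("UPDATE", "u1", "e1"),   # second operation on the same event
    ("DELETE", "u2", "e2"),
]

# Warehouse sources: operations on the same event within one sync window
# collapse to a single billed event, so count distinct events.
warehouse_volume = len({(user, insert) for _, user, insert in mutations})

# File sources (S3): every operation counts separately.
file_volume = len(mutations)

print(warehouse_volume, file_volume)  # 2 3
```

In other words, frequent corrections to the same events are cheaper through a warehouse source than through S3 files.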
Data retention
- Snowflake: `DATA_RETENTION_TIME_IN_DAYS` must be ≥ 1 (recommended: ≥ 7 days).
- Databricks: Change Data Feed retention must cover your sync frequency.
- S3: Files must remain accessible throughout processing.
Best practices
Keep the following best practices in mind as you enable data mutability.
Plan your implementation
Start with a test project: Create a dedicated test environment to validate your mutation logic before implementing in production.
Design for idempotency: Ensure your mutation operations can be safely retried without causing data inconsistencies.
Monitor data quality: Implement validation checks to ensure mutations apply correctly.
Data privacy compliance
When using data mutability for privacy compliance:
Stop data flow first: Before you delete user data, ensure you send no new data about that user to Amplitude.
Use the User Privacy API: For complete user deletion, use the User Privacy API alongside your warehouse deletions.
Verify deletion: Confirm that deleted data no longer appears in your analytics.
Performance optimization
- Batch operations: Group related mutations together when possible.
- Optimize sync frequency: Balance data freshness needs with processing overhead.
- Monitor resource usage: Track warehouse compute costs associated with change tracking.
Migrate to data mutability
If you're migrating from a standard ingestion strategy to Mirror Sync, follow these steps.
Recommended migration steps
Create cutoff strategy:
- Modify the existing connection with a time filter (for example, `WHERE time < {cutOffDate}`).
- Set the cutoff date to tomorrow in milliseconds since epoch.
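A cutoff timestamp of "tomorrow in milliseconds since epoch" can be computed like this (a sketch; `{cutOffDate}` stands for whatever placeholder your connection's filter uses, and midnight UTC is an assumed convention):

```python
from datetime import datetime, timedelta, timezone

# "Tomorrow at midnight UTC" as milliseconds since the Unix epoch: the value
# to substitute for {cutOffDate} in both connections' time filters.
now = datetime.now(timezone.utc)
tomorrow = (now + timedelta(days=1)).replace(
    hour=0, minute=0, second=0, microsecond=0)
cut_off_date_ms = int(tomorrow.timestamp() * 1000)

# Old connection keeps:   WHERE time <  {cutOffDate}
# New Mirror Sync keeps:  WHERE time >= {cutOffDate}
print(cut_off_date_ms)
```

Because the two filters use `<` and `>=` against the same value, every event lands in exactly one connection and nothing is double-counted.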
Wait for cutoff: Allow the cutoff date to pass and verify no new data flows through the old connection.
Create new Mirror Sync source:
- Configure the new source with a complementary filter (for example, `WHERE time >= {cutOffDate}`).
- Enable Mirror Sync with the mutation settings you want.
Clean up: Remove the old source connection after verifying the new one works correctly.
Common issues
Events don't update
- Verify that change tracking is enabled on source tables.
- Check that events contain required user IDs.
- Confirm sync frequency settings.
Missing deletions
- Ensure DELETE operations are properly configured in your source.
- Verify that deleted events had valid user IDs.
- Check that change retention periods haven't expired.
Data inconsistencies
- Review mutation operation ordering.
- Verify that Amplitude disabled enrichment services as expected.
- Check for timing issues between warehouse changes and sync execution.